System and method for real time lip synchronization

Image analysis – Applications

Reexamination Certificate

Details

US classification: C382S118000, C704S235000, C704S256000, C704S258000, C704S260000, C704S270000

Type: Reexamination Certificate

Status: active

Patent number: 07133535

ABSTRACT:
A novel method for synchronizing the lips of a sketched face to an input voice. The lip synchronization system and method reuse training video as much as possible whenever the input voice is similar to the training voice sequences. First, face sequences are clustered from video segments; then, using sub-sequence Hidden Markov Models, a correlation between speech signals and face-shape sequences is built. This reuse of video reduces the discontinuity between consecutive output faces and yields accurate, realistic synthesized animations. The lip synchronization system and method can synthesize faces from input audio in real time without noticeable delay. Because acoustic feature data computed from the audio drives the system directly, without any phonemic representation, the method can adapt to any kind of voice, language or sound.
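The abstract outlines an audio-driven pipeline: frame-level acoustic features are computed directly from the input audio (no phoneme recognition), scored against per-cluster sub-sequence HMMs learned from training video, and the face-shape sequence of the best-scoring cluster is reused as output, with a preference for continuing the previous cluster to reduce discontinuity. The sketch below illustrates that flow under stated assumptions only; the class name, the "stickiness" smoothing bonus, and the use of librosa and hmmlearn are illustrative choices, not the patent's actual implementation.

```python
# Minimal sketch of the audio-driven lip-sync idea described in the abstract.
# All names, the smoothing bonus, and the library choices (librosa, hmmlearn)
# are assumptions for illustration, not the patented implementation.

import numpy as np
import librosa                          # assumed: MFCC-style acoustic features
from hmmlearn.hmm import GaussianHMM    # assumed: continuous-density HMMs


class SubSequenceLipSync:
    def __init__(self, n_states=5, n_mfcc=13):
        self.n_states = n_states
        self.n_mfcc = n_mfcc
        self.models = []          # one speech HMM per face-sequence cluster
        self.face_sequences = []  # representative face-shape sequence per cluster

    def train(self, clusters):
        """clusters: list of (audio_feature_arrays, face_shape_sequence) pairs,
        where audio_feature_arrays is a list of (T_i, n_mfcc) arrays taken from
        the training-video segments grouped into that cluster."""
        for feats, faces in clusters:
            X = np.vstack(feats)
            lengths = [f.shape[0] for f in feats]
            model = GaussianHMM(n_components=self.n_states, covariance_type="diag")
            model.fit(X, lengths)          # learn the speech model for this cluster
            self.models.append(model)
            self.face_sequences.append(faces)

    def features(self, audio, sr):
        """Frame-level acoustic features; no phonemic labels are computed,
        so any voice, language or sound can drive the system."""
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=self.n_mfcc)
        return mfcc.T                      # shape (frames, n_mfcc)

    def synthesize(self, audio, sr, prev_cluster=None, stickiness=2.0):
        """Pick the face-shape sequence whose speech HMM best explains the
        incoming audio window; the small log-likelihood bonus for the previous
        cluster (an assumption standing in for the patent's smoothing criterion)
        reduces discontinuity between consecutive output faces."""
        obs = self.features(audio, sr)
        scores = np.array([m.score(obs) for m in self.models])
        if prev_cluster is not None:
            scores[prev_cluster] += stickiness
        best = int(np.argmax(scores))
        return self.face_sequences[best], best
```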

REFERENCES:
patent: 5880788 (1999-03-01), Bregler
patent: 5907351 (1999-05-01), Chen et al.
patent: 5933151 (1999-08-01), Jayant et al.
patent: 6366885 (2002-04-01), Basu et al.
patent: 6449595 (2002-09-01), Arslan et al.
patent: 6735566 (2004-05-01), Brand
patent: 6813607 (2004-11-01), Faruquie et al.
patent: 6919892 (2005-07-01), Cheiky et al.
patent: 2002/0007276 (2002-01-01), Rosenblatt et al.
patent: 2002/0152074 (2002-10-01), Junqua
patent: 2004/0122675 (2004-06-01), Nefian et al.
Kakihara et al. "Speech-to-Face Movement Synthesis Based on HMMs", Aug. 2, 2000, 2000 IEEE International Conference on Multimedia and Expo (ICME 2000), vol. 1, pp. 427-430.
Huang et al. “Real-Time Lip-Synch Face Animation Driven by Human Voice”, Dec. 9, 1998, 1998 IEEE Second Workshops on Multimedia Signal Processing, pp. 352-357.
Nakamura et al. “Speech-to-Lip Movements Synthesis Maximizing Audio-Visual Joint Probability Based on EM Algorithm”, Dec. 9, 1998, 1998 IEEE Second Workshops on Multimedia Signal Processing, pp. 53-58.
Masuko et al. "Text-to-Visual Speech Synthesis Based on Parameter Generation from HMM", May 15, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '98), vol. 6, pp. 3745-3748.
Jones et al. “Automated Lip Synchronisation for Human-Computer Interaction and Special Effect Animation”, Jun. 6, 1997, IEEE International Conference on Multimedia Computing and Systems, pp. 589-596.
Curinga, “Use of Statistical Model for Lip Synthesis”, Sep. 10, 1998, 1998 IEEE International Conference on Electronics, Circuits and Systems, vol. 3, pp. 539-541.
Brooke et al. "Computer Graphics Animations of Talking Faces Based on Stochastic Models", Proceedings of the 1994 International Symposium on Speech, Image Processing and Neural Networks (ISSIPNN '94), vol. 1, pp. 73-76.
Choi et al. “Baum-Welch Hidden Markov Model Inversion for Reliable Audio-to-Visual Conversion”, 1999 IEEE 3rd Workshop on Multimedia Signal Processing, pp. 175-180.
Williams et al. “An HMM-Based Speech-to-Video Synthesizer”, Jul. 2002, IEEE Transactions on Neural Networks, vol. 13, Issue 4, pp. 900-915.
Brand, M., Voice puppetry, Proc. ACM SIGGRAPH'99, 1999.
Bregler, C., M. Covell, and M. Slaney, Video rewrite: Driving visual speech with audio, Proc. ACM SIGGRAPH'97, 1997.
Chang, E., J. Zhou, S. Di, C. Huang, and K. Lee, Large vocabulary Mandarin speech recognition with different approaches in modeling tones, International Conference on Spoken Language Processing, Beijing, Oct. 16-20, 2000.
Covell, M., and C. Bregler, Eigenpoints, Proc. Int. Conf. Image Processing, Lausanne, Switzerland, vol. 3, pp. 471-474, 1996.
Curinga, S., R. Pockaj, F. Vignoli, C. Braccini, and F. Lavagetto, Application of synthetic lip motion to hybrid video coding, Int. Workshop on Synthetic Natural Hybrid Coding and 3D Imaging (IWSNHC3DI'97), Sep. 5-9, 1997, Rhodes, pp. 187-191.
Curinga, S., F. Lavagetto, and F. Vignoli, Lip movement synthesis using time delay neural networks, Proc. EUSIPCO'96, 1996.
Davis, S. B. and P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech and Signal Processing, Aug. 1980, ASSP-28:357-366.
Goff, B. Le, T. Guiard-Marigny, M. Cohen, and C. Benoit, Real-time analysis-synthesis and intelligibility of talking faces, 2nd International Conference on Speech Synthesis, Newark (NY), Sep. 1994.
Huang, Fu Jie and Tsuhan Chen, Real-time lip-synch face animation driven by human voice, IEEE Multimedia Signal Processing Workshop, Los Angeles, California, 1998.
Kshirsagar, S. and N. Magnenat-Thalmann, Lip synchronization using linear predictive analysis, Proceedings of the IEEE International Conference on Multimedia and Expo, New York, Aug. 2000.
Lavagetto, F., Converting speech into lip movements: a multimedia telephone for hard of hearing people, IEEE Transactions on Rehabilitation Engineering, vol. 3, no. 1, 1995, pp. 90-102.
Morishima, S., and H. Harashima, A media conversion from speech to facial image for intelligent man-machine interface, IEEE Journal on Selected Areas in Communications, 9(4), 1991.
Picone, J., Signal modeling techniques in speech recognition, Proceedings of the IEEE, 1993.
Rabiner, L., A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 1989, 77(2):257-286.
Rabiner, L. R., J. G. Wilpon, and F. K. Soong, High performance connected digit recognition using hidden Markov models, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 8, Aug. 1989.
Rao, R., and T. Chen, Using HMM's for audio-to-visual conversion, IEEE '97 Workshop on Multimedia Signal Processing, 1997.
Yamamoto, E., S. Nakamura and K. Shikano, Lip movement synthesis from speech based on hidden Markov models, Proc. Int. Conf. on Automatic Face and Gesture Recognition, FG '98, Nara, Japan, 1998, IEEE Computer Society, pp. 154-159.
