Reexamination Certificate
2007-04-24
Hudspeth, David (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S260000, C345S473000, C348S515000
active
10143717
ABSTRACT:
A system and method for generating a video sequence having mouth movements synchronized with speech sounds are disclosed. The system uses a database of n-phones as the smallest selectable unit, where n is larger than 1 and preferably 3. For each candidate n-phone for a target frame, the system calculates a target cost using a phonetic distance, a coarticulation parameter, and a speech rate. For each n-phone in a target sequence, the system searches for candidate n-phones that are visually similar according to the target cost. The system samples each candidate n-phone to obtain the same number of frames as in the target sequence and builds a video frame lattice of candidate video frames. The system assigns a joint cost to each pair of adjacent frames and searches the video frame lattice to construct the video sequence by finding the optimal path through the lattice, that is, the path that minimizes the sum of the target cost and the joint cost over the sequence.
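The lattice search described in the abstract can be illustrated with a small dynamic-programming sketch. This is not the patented implementation: the function names resample and best_path, the nearest-index sampling rule, and the per-frame form of target_cost and joint_cost are all assumptions made for illustration. The abstract states only that candidates are sampled to a matching frame count and that the chosen path minimizes the summed target and joint costs.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def resample(frames: Sequence[Frame], n: int) -> List[Frame]:
    """Pick n frames spread evenly over a candidate unit.

    Nearest-index sampling is an assumption; the abstract does not say
    how candidates are sampled.
    """
    if n <= 0 or not frames:
        return []
    if n == 1:
        return [frames[len(frames) // 2]]
    last = len(frames) - 1
    return [frames[round(i * last / (n - 1))] for i in range(n)]


def best_path(
    lattice: Sequence[Sequence[Frame]],
    target_cost: Callable[[int, Frame], float],
    joint_cost: Callable[[Frame, Frame], float],
) -> Tuple[float, List[Frame]]:
    """Return (total cost, frames) of the cheapest path through the lattice.

    lattice[t] holds the candidate frames for output position t.
    target_cost(t, f) and joint_cost(prev, f) are hypothetical callables
    supplied by the caller.
    """
    if not lattice or any(len(column) == 0 for column in lattice):
        raise ValueError("every lattice position needs at least one candidate")

    # cost[j]: cheapest cost of any path ending at candidate j of the current column
    cost = [target_cost(0, f) for f in lattice[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor index for column t

    for t in range(1, len(lattice)):
        prev = lattice[t - 1]
        new_cost, pointers = [], []
        for f in lattice[t]:
            k = min(range(len(prev)), key=lambda i: cost[i] + joint_cost(prev[i], f))
            new_cost.append(cost[k] + joint_cost(prev[k], f) + target_cost(t, f))
            pointers.append(k)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    total = cost[j]
    indices = [j]
    for pointers in reversed(back):
        j = pointers[j]
        indices.append(j)
    indices.reverse()
    return total, [lattice[t][j] for t, j in enumerate(indices)]
```

In this sketch a caller would resample each candidate to the length of the target unit, stack the resampled frames into lattice columns, and pass cost functions of its own choosing. The search runs in time proportional to the number of positions times the square of the number of candidates per position.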
REFERENCES:
patent: 125949 (1872-08-01), Okutani et al.
patent: 137537 (1873-03-01), Guo et al.
patent: 5278943 (1994-01-01), Gasper et al.
patent: 5384893 (1995-01-01), Hutchins
patent: 5502790 (1996-03-01), Yi
patent: 5657426 (1997-08-01), Waters et al.
patent: 5826234 (1998-10-01), Lyberg
patent: 5880788 (1999-03-01), Bregler
patent: 6006175 (1999-12-01), Holzrichter
patent: 6385580 (2002-05-01), Lyberg et al.
patent: 6449595 (2002-09-01), Arslan et al.
patent: 6539354 (2003-03-01), Sutton et al.
patent: 6662161 (2003-12-01), Cosatto et al.
patent: 6735566 (2004-05-01), Brand
patent: 6778252 (2004-08-01), Moulton et al.
patent: 6839672 (2005-01-01), Beutnagel et al.
“Spoken Language Processing” by X. D. Huang, et al., Prentice-Hall, 2001, pp. 804-818.
“Photo-Realistic Talking Heads” by E. Cosatto, et al., IEEE Trans. on Multimedia, vol. 2, No. 3, Sep. 2000.
“Rapid Unit Selection from a Large Speech Corpus for Concatenative Speech Synthesis” by M. Beutnagel, et al., Eurospeech, 1999.
“Unit Selection in a Concatenative Speech Synthesis System Using a Large Speech Database” by A. Hunt, et al., ICASSP, 1996.
“Optimizing Selection of Units from Speech Databases for Concatenative Synthesis” by A. Black, et al., Eurospeech, 1995.
“Modeling Coarticulation in Synthetic Visual Speech” by M. Cohen, et al., in Models and Techniques in Computer Animation, Springer-Verlag, 1993.
“Audio-Visual Speech Modeling for Continuous Speech Recognition” by S. Dupont, et al., IEEE Transactions on Multimedia, vol. 2, No. 3, Sep. 2000.
Cosatto, Eric
Graf, Hans Peter
Huang, Fu Jie
AT&T Corp.
Sked, Matthew J.