Reexamination Certificate
2008-05-06
2008-05-06
Dorvil, Richemond (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S270000
active
07369992
ABSTRACT:
A system and method for generating a video sequence having mouth movements synchronized with speech sounds are disclosed. The system uses a database of n-phones as the smallest selectable unit, where n is larger than 1 and preferably 3 (a triphone). The system calculates a target cost for each candidate n-phone for a target frame using a phonetic distance, a coarticulation parameter, and the speech rate. For each n-phone in a target sequence, the system searches for candidate n-phones that are visually similar according to the target cost. The system samples each candidate n-phone to obtain the same number of frames as in the target sequence and builds a video frame lattice of candidate video frames. The system assigns a joint cost to each pair of adjacent frames and searches the lattice for the optimal path, that is, the path that minimizes the sum of the target cost and the joint cost over the sequence, to construct the video sequence.
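The lattice search described in the abstract can be illustrated with a small dynamic-programming (Viterbi-style) sketch. This is not the patented implementation: the function names `resample_indices` and `best_path`, the nearest-index resampling rule, and the cost values in the usage note are all assumptions made for illustration. The abstract only states that candidates are sampled to the target length and that the path minimizing the summed target and joint costs is selected.

```python
from typing import Callable, List, Sequence, Tuple


def resample_indices(n_source: int, n_target: int) -> List[int]:
    """Pick n_target frame indices spread evenly over a candidate of n_source frames.

    Assumption: nearest-index sampling; the abstract does not specify the rule.
    """
    if n_source < 1 or n_target < 1:
        raise ValueError("both lengths must be at least 1")
    if n_target == 1:
        return [0]
    step = (n_source - 1) / (n_target - 1)
    return [round(k * step) for k in range(n_target)]


def best_path(
    target_costs: Sequence[Sequence[float]],
    joint_cost: Callable[[int, int, int], float],
) -> Tuple[float, List[int]]:
    """Find the minimum-cost path through a frame lattice.

    target_costs[t][j] is the target cost of candidate j at frame t.
    joint_cost(t, i, j) is the cost of placing candidate i at frame t-1
    directly before candidate j at frame t (hypothetical signature).
    Returns (total cost, chosen candidate index per frame).
    """
    if not target_costs:
        return 0.0, []

    best = list(target_costs[0])      # cheapest cost of reaching each candidate
    back: List[List[int]] = []        # back-pointers, one list per frame after the first

    for t in range(1, len(target_costs)):
        new_best: List[float] = []
        pointers: List[int] = []
        for j, tc in enumerate(target_costs[t]):
            # Cheapest predecessor for candidate j at frame t.
            i_min = min(range(len(best)), key=lambda i: best[i] + joint_cost(t, i, j))
            new_best.append(best[i_min] + joint_cost(t, i_min, j) + tc)
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest final candidate back to the first frame.
    end = min(range(len(best)), key=best.__getitem__)
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[end], path
```

For example, with two candidates per frame, target costs `[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]`, and a made-up joint cost of 0.4 whenever adjacent frames come from different candidates, `best_path` selects candidates 1, 0, 1, since switching twice is cheaper than accepting the high target cost in the middle frame.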
REFERENCES:
patent: 5278943 (1994-01-01), Gasper et al.
patent: 5384893 (1995-01-01), Hutchins
patent: 5502790 (1996-03-01), Yi
patent: 5657426 (1997-08-01), Waters et al.
patent: 5826234 (1998-10-01), Lyberg
patent: 5880788 (1999-03-01), Bregler
patent: 125949 (1999-08-01), Okutani et al.
patent: 6006175 (1999-12-01), Holzrichter
patent: 6112177 (2000-08-01), Cosatto et al.
patent: 137537 (2002-03-01), Guo et al.
patent: 6385580 (2002-05-01), Lyberg et al.
patent: 6449595 (2002-09-01), Arslan et al.
patent: 6539354 (2003-03-01), Sutton et al.
patent: 6654018 (2003-11-01), Cosatto et al.
patent: 6662161 (2003-12-01), Cosatto et al.
patent: 6735566 (2004-05-01), Brand
patent: 6778252 (2004-08-01), Moulton et al.
patent: 6839672 (2005-01-01), Beutnagel et al.
patent: 2003/0137537 (2003-07-01), Guo et al.
patent: 2250405 (1992-06-01), None
X.D. Huang et al., “Spoken Language Processing”, Prentice Hall, 2001, pp. 804-818.
E. Cosatto et al., “Photo-Realistic Talking Heads”, IEEE Trans. on Multimedia, vol. 2, No. 3, Sep. 2000.
M. Beutnagel et al., “Rapid Unit Selection from a Large Speech Corpus for Concatenative Speech Synthesis”, Eurospeech, 1999.
A. Hunt et al., “Unit Selection in a Concatenative Speech Synthesis System Using a Large Speech Database”, ICASSP, 1996.
A. Black et al., “Optimizing Selection of Units from Speech Databases for Concatenative Synthesis”, Eurospeech, 1995.
M. Cohen et al., “Modeling Coarticulation in Synthetic Visual Speech”, In Models and Techniques in Computer Animation, Springer-Verlag, 1993.
S. Dupont et al., “Audio-Visual Speech Modeling for Continuous Speech Recognition”, IEEE Transactions on Multimedia, vol. 2, No. 3, Sep. 2000.
Cosatto Eric
Graf Hans Peter
Huang Fu Jie
AT&T Corp.
Dorvil Richemond
Sked Matthew J.
System and method for triphone-based unit selection for...
Profile ID: LFUS-PAI-O-2815921