Reexamination Certificate
2011-04-26
2011-04-26
Sked, Matthew J (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S258000, C704S270000, C345S473000, C382S100000, C382S118000
active
07933772
ABSTRACT:
A system and method for generating a video sequence having mouth movements synchronized with speech sounds are disclosed. The system utilizes a database of n-phones as the smallest selectable unit, wherein n is larger than 1 and preferably 3. The system calculates a target cost for each candidate n-phone for a target frame using a phonetic distance, a coarticulation parameter, and the speech rate. For each n-phone in a target sequence, the system searches for candidate n-phones that are visually similar according to the target cost. The system samples each candidate n-phone to obtain the same number of frames as in the target sequence and builds a video frame lattice of candidate video frames. The system assigns a joint cost to each pair of adjacent frames and searches the video frame lattice to construct the video sequence by finding the optimal path through the lattice, that is, the path that minimizes the sum of the target cost and the joint cost over the sequence.
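The sampling and lattice-search steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `resample` and `best_path`, the nearest-index sampling rule, and the cost callbacks `target_cost` and `joint_cost` are all assumptions made for illustration. The abstract does not say how the phonetic distance, coarticulation parameter, and speech rate are combined into a target cost, so the sketch takes that cost as an input.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def resample(frames: Sequence[Frame], n: int) -> List[Frame]:
    """Pick n frames spread evenly over a candidate (assumed nearest-index rule)."""
    if n <= 0 or not frames:
        return []
    if n == 1:
        return [frames[len(frames) // 2]]
    step = (len(frames) - 1) / (n - 1)
    return [frames[round(i * step)] for i in range(n)]


def best_path(
    lattice: Sequence[Sequence[Frame]],
    target_cost: Callable[[int, Frame], float],
    joint_cost: Callable[[Frame, Frame], float],
) -> Tuple[float, List[Frame]]:
    """Dynamic-programming search over a frame lattice.

    lattice[t] holds the candidate frames for target position t.
    Returns the minimum summed (target + joint) cost and the chosen frames.
    """
    # cost[j]: cheapest total cost of any path ending at candidate j
    # of the column processed so far.
    cost = [target_cost(0, f) for f in lattice[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor of candidate j at t
    for t in range(1, len(lattice)):
        prev = lattice[t - 1]
        new_cost, ptr = [], []
        for f in lattice[t]:
            # Cheapest way to reach f from any candidate in the previous column.
            k = min(range(len(prev)), key=lambda i: cost[i] + joint_cost(prev[i], f))
            new_cost.append(cost[k] + joint_cost(prev[k], f) + target_cost(t, f))
            ptr.append(k)
        cost = new_cost
        back.append(ptr)
    # Trace back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    total = cost[j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return total, [lattice[t][j] for t, j in enumerate(path)]
```

In this sketch each column of the lattice would be filled from candidates that have already been passed through `resample`, so that every candidate n-phone contributes exactly one frame per target position. The search runs in time proportional to the sequence length times the square of the number of candidates per position.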
REFERENCES:
patent: 5278943 (1994-01-01), Gasper et al.
patent: 5384893 (1995-01-01), Hutchins
patent: 5502790 (1996-03-01), Yi
patent: 5657426 (1997-08-01), Waters et al.
patent: 5826234 (1998-10-01), Lyberg
patent: 5880788 (1999-03-01), Bregler
patent: 5907351 (1999-05-01), Chen et al.
patent: 6006175 (1999-12-01), Holzrichter
patent: 6112177 (2000-08-01), Cosatto et al.
patent: 6366885 (2002-04-01), Basu et al.
patent: 6385580 (2002-05-01), Lyberg et al.
patent: 6449595 (2002-09-01), Arslan et al.
patent: 6539354 (2003-03-01), Sutton et al.
patent: 6654018 (2003-11-01), Cosatto et al.
patent: 6662161 (2003-12-01), Cosatto et al.
patent: 6735566 (2004-05-01), Brand
patent: 6778252 (2004-08-01), Moulton et al.
patent: 6839672 (2005-01-01), Beutnagel et al.
patent: 2003/0125949 (2003-07-01), Okutani et al.
patent: 2003/0137537 (2003-07-01), Guo et al.
patent: 2250405 (1992-06-01), None
Cosatto, E. “Photo-Realistic Talking-Heads From Image Samples,” IEEE Transactions on Multimedia, vol. 2, No. 3, Sep. 2000.
X.D. Huang et al., “Spoken Language Processing”, Prentice Hall, 2001, pp. 804-818.
E. Cosatto et al., “Photo-Realistic Talking Heads”, IEEE Trans. on Multimedia, vol. 2, No. 3, Sep. 2000.
M. Beutnagel et al., “Rapid Unit Selection from a Large Speech Corpus for Concatenative Speech Synthesis”, Eurospeech, 1999.
A. Hunt et al., “Unit Selection in a Concatenative Speech Synthesis System Using a Large Speech Database”, ICASSP, 1996.
A. Black et al., “Optimizing Selection of Units from Speech Databases for Concatenative Synthesis”, Eurospeech, 1995.
M. Cohen et al., “Modeling Coarticulation in Synthetic Visual Speech”, In Models and Techniques in Computer Animation, Springer-Verlag, 1993.
S. Dupont et al., “Audio-Visual Speech Modeling for Continuous Speech Recognition”, IEEE Transactions on Multimedia, vol. 2, No. 3, Sep. 2000.
Cosatto Eric
Graf Hans Peter
Huang Fu Jie
AT&T Intellectual Property II L.P.
Sked Matthew J