Coupled hidden Markov model for audiovisual speech recognition

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S256100, C704S256200

Reexamination Certificate

active

10142468

ABSTRACT:
A speech recognition method uses synchronous or asynchronous audio and video data to improve speech recognition accuracy. A two-stream coupled hidden Markov model is trained and used to identify speech. One stream is derived from audio data and a second stream is derived from mouth pattern data. Gestural or other suitable data streams can optionally be combined with these to reduce speech recognition error rates in noisy environments.
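As an illustration of the two-stream idea in the abstract, the following is a minimal sketch of the forward (likelihood) computation for a coupled hidden Markov model with one audio chain and one video chain. It is not taken from the patent text: the function name `chmm_forward`, the table layout, and the use of discrete observation symbols are all assumptions made for brevity. Each chain's next state is conditioned on the previous states of both chains, which is what lets the two streams sit in different states at the same time step.

```python
from itertools import product


def chmm_forward(obs_a, obs_v, init_a, init_v, trans_a, trans_v, emit_a, emit_v):
    """Likelihood of paired observation sequences under a two-chain coupled HMM.

    Hypothetical table layout (an assumption, not from the patent):
      trans_a[i][j][k] = P(audio state k at t | audio state i, video state j at t-1)
      trans_v[i][j][l] = P(video state l at t | audio state i, video state j at t-1)
      emit_a[k][o]     = P(audio symbol o | audio state k)
      emit_v[l][o]     = P(video symbol o | video state l)
    """
    n_a, n_v = len(init_a), len(init_v)
    states = list(product(range(n_a), range(n_v)))

    # alpha[(k, l)] = P(observations so far, audio state k, video state l)
    alpha = {
        (k, l): init_a[k] * init_v[l] * emit_a[k][obs_a[0]] * emit_v[l][obs_v[0]]
        for k, l in states
    }

    # Advance one time step at a time over the paired observations.
    for o_a, o_v in zip(obs_a[1:], obs_v[1:]):
        alpha = {
            (k, l): emit_a[k][o_a] * emit_v[l][o_v] * sum(
                alpha[(i, j)] * trans_a[i][j][k] * trans_v[i][j][l]
                for i, j in states
            )
            for k, l in states
        }

    # Marginalize over the final joint state.
    return sum(alpha.values())


# Toy model: two states and two symbols per stream (illustrative numbers only).
INIT_A = [0.6, 0.4]
INIT_V = [0.5, 0.5]
TRANS_A = [[[0.7, 0.3], [0.6, 0.4]], [[0.2, 0.8], [0.1, 0.9]]]
TRANS_V = [[[0.8, 0.2], [0.3, 0.7]], [[0.6, 0.4], [0.1, 0.9]]]
EMIT_A = [[0.9, 0.1], [0.2, 0.8]]
EMIT_V = [[0.7, 0.3], [0.4, 0.6]]

likelihood = chmm_forward(
    [0, 1, 1], [0, 0, 1], INIT_A, INIT_V, TRANS_A, TRANS_V, EMIT_A, EMIT_V
)
```

In a recognizer built along these lines, one such model would typically be trained per speech unit, and an utterance would be assigned to the unit whose model gives the highest likelihood. A practical system would more likely use continuous observation densities and log-domain arithmetic; both are omitted here.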

REFERENCES:
patent: 5454043 (1995-09-01), Freeman
patent: 5596362 (1997-01-01), Zhou
patent: 5710590 (1998-01-01), Ichige et al.
patent: 5754695 (1998-05-01), Kuo et al.
patent: 5850470 (1998-12-01), Kung et al.
patent: 5887069 (1999-03-01), Sakou et al.
patent: 6024852 (2000-02-01), Tamura et al.
patent: 6072494 (2000-06-01), Nguyen
patent: 6075895 (2000-06-01), Qiao et al.
patent: 6108005 (2000-08-01), Starks et al.
patent: 6128003 (2000-10-01), Smith et al.
patent: 6184926 (2001-02-01), Khosravi et al.
patent: 6185529 (2001-02-01), Chen et al.
patent: 6191773 (2001-02-01), Maruno et al.
patent: 6212510 (2001-04-01), Brand
patent: 6215890 (2001-04-01), Matsuo et al.
patent: 6219639 (2001-04-01), Bakis et al.
patent: 6222465 (2001-04-01), Kumar et al.
patent: 6304674 (2001-10-01), Cass et al.
patent: 6335977 (2002-01-01), Kage
patent: 6385331 (2002-05-01), Harakawa et al.
patent: 6594629 (2003-07-01), Basu et al.
patent: 6609093 (2003-08-01), Gopinath et al.
patent: 6624833 (2003-09-01), Kumar et al.
patent: 6633844 (2003-10-01), Verma et al.
patent: 6678415 (2004-01-01), Popat et al.
patent: 6751354 (2004-06-01), Foote et al.
patent: 6816836 (2004-11-01), Basu et al.
patent: 6952687 (2005-10-01), Andersen et al.
patent: 6964023 (2005-11-01), Maes et al.
patent: 2002/0036617 (2002-03-01), Pryor
patent: 2002/0093666 (2002-07-01), Foote et al.
patent: 2002/0102010 (2002-08-01), Liu et al.
patent: 2002/0135618 (2002-09-01), Maes et al.
patent: 2002/0140718 (2002-10-01), Yan et al.
patent: 2002/0161582 (2002-10-01), Basson et al.
patent: 2003/0123754 (2003-07-01), Toyama
patent: 2003/0144844 (2003-07-01), Colmenarez et al.
patent: 2003/0154084 (2003-08-01), Li et al.
patent: 2003/0171932 (2003-09-01), Juang et al.
patent: 2003/0190076 (2003-10-01), DeLean
patent: 2112273 (1995-08-01), None
patent: 2093890 (1995-09-01), None
patent: 2093890 (1997-10-01), None
patent: 2112273 (1998-05-01), None
patent: WO 00/36845 (2000-06-01), None
Vladimir Ivan Pavlovic, “Dynamic Bayesian Networks for Information Fusion with Applications to Human-Computer Interfaces,” Thesis, University of Urbana-Champaign, 1999, pp. iii to ix and 63 to 81.
Rezek et al., “Coupled hidden Markov models for biosignal interaction,” Advances in Medical Signal and Information Processing, Sep. 4-6, 2000, pp. 54 to 59.
Fu et al., “Audio-visual speaker identification using coupled hidden Markov models,” 2003 International Conference on Image Processing, ICIP 2003, Sep. 14-17, 2003, vol. 2, pp. 29 to 32.
Nefian et al., “A coupled HMM for audio-visual speech recognition,” Proceeding IEEE International Conference on Acoustics, Speech, and Signal Processing, May 13-17, 2002, vol. 2, pp. 2013-2016.
Pavlovic, “Dynamic Bayesian Networks for Information Fusion with Application to Human-Computer Interfaces,” Thesis, University of Urbana-Champaign, 1999, pp. iii to v, 24 to 25, 29, 35, 59 to 61, and 63 to 81.
Wikipedia, definition of “Hidden Markov model,” 3 Pages.
Wikipedia, definition of “Viterbi algorithm,” 5 Pages.
Rezek et al., “Learning interaction dynamics with coupled hidden Markov models,” IEE Proceedings—Science, Measurement and Technology, Nov. 2000, vol. 147, Issue 6, pp. 345 to 350.
Kristjansson et al., “Event-coupled hidden Markov models,” 2000 IEEE International Conference on Multimedia and Expo, 2000, Jul. 30 to Aug. 2, 2000, vol. 1, pp. 385 to 388.
Pavlovic, “Multimodal tracking and classification of audio-visual features,” 1998 International Conference on Image Processing, 1998. ICIP 98 Proceedings, Oct. 4-7, 1998, vol. 1, pp. 343 to 347.
Dupont et al: Audio-Visual Speech Modeling for Continuous Speech Recognition; Sep. 2000 IEEE Transactions on Multimedia, vol. 2 No. 3; pp. 141-151.
Potamianos et al: An Image Transform Approach for HMM Based Automatic Lipreading; 1998 IEEE; pp. 173-177.
Potamianos et al: Linear Discriminant Analysis for Speechreading; AT&T Labs - Research, Florham Park, NJ; 6 pages.
Chan: HMM-Based Audio-Visual Speech Recognition Integrating Geometric-and Appearance-Based Visual Features; 2001 IEEE; Rockwell Scientific Company, CA; pp.
Hennecke et al: Automatic Speech Recognition System Using Acoustic and Visual Signals; 1996 IEEE Proceedings of ASILOMAR-29; pp. 1214-1218.
Dugad: Tutorial on Hidden Markov Models; Technical Report No.: SPANN-96, May 1996, pp. 1-16.
Brand: Coupled Hidden Markov Models for Modeling Interacting Processes; Learning and Common Sense Technical Report 405, Jun. 3, 1997, MIT Media Lab Perceptual Computing, USA, pp. 1-28.
Nefian et al: An Embedded HMM-Based Approach for Face Detection and Recognition; Proceedings of the IEEE Int'l Conference on Acoustics, Speech and Signal Processing, Mar. 15-19, 1999; IEEE, Mar. 15, 1999, pp. 3553-3556, USA.
Nefian: Embedded Bayesian Networks for Face Recognition; IEEE Int'l Conference on Multimedia and Expo; IEEE vol. 2, Aug. 26, 2002, pp. 133-136.
Kennedy, et al: Identification of Coupled Markov Chain Model with Application; Proceedings of the 31st IEEE Conference on Decision and Control, Dec. 16-18, 1992; vol. 4, pp. 3529-3534.
Ramesh, et al: Automatic Selection of Tuning Parameters for Feature Extraction Sequences; Proceedings IEEE Computer Society Conference on Computer vision and Pattern Recognition; Jun. 21-23, 1994, pp. 672-677.
Liang, et al: Speaker Independent Audio-Visual Continuous Speech Recognition; Aug. 2002; Multimedia and Expo, vol. 2, pp. 25-28; IEEE.
Logan et al: Factorial Hidden Markov Models for Speech Recognition: Preliminary Experiments; Cambridge Research Laboratory, UK, Technical Report Series CFL 97/7, Sep. 1997, 20 pages.
Pending U.S. Appl. No.: 10/326,368; Office Action dated Jul. 25, 2006.
Luettin et al.: Asynchronous Stream Modelling for Large Vocabulary Audio-Visual Speech Recognition, Proceedings of the 2001 IEEE Int'l Conference of Acoustics, Speech and Signal Processing (ICASSP'01), May 7-11, 2001, pp. 169-172.
Gordan: A Temporal Network for Support Vector Machine Classifiers for the Recognition of Visual Speech, Methods and Applications of Artificial Intelligence: Proceedings of the 2nd Hellenic Conference on AI (SETN 2002), Thessaloniki, Greece, Apr. 11-12, 2002, pp. 355-365.
Ming-Hsuan Yang et al.: Detecting Faces in Images: A Survey; IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, Jan. 2002, pp. 34-58.
Yongmin Li et al.: Multi-view Face Detection Using Support Vector Machines and Eigenspace Modelling, Proceedings on the Int'l Conference on Knowledge-based Intelligent Engineering Systems and.
Batra: Modeling and Efficient Optimization for Object-Based Scalability and Some Related Problems, IEEE Transactions on Image Processing, vol. 9, No. 10, Oct. 10, 2000, pp. 1677-1692.
