Factorial hidden Markov model for audiovisual speech...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate


Details

Classifications: C704S251000, C704S256200, C704S244000, C382S115000, C382S118000
Type: Reexamination Certificate
Status: active
Serial No.: 10142447

ABSTRACT:
A speech recognition method includes use of synchronous or asynchronous audio and video data to enhance speech recognition probabilities. A two-stream factorial hidden Markov model is trained and used to identify speech. At least one stream is derived from audio data and a second stream is derived from mouth pattern data. Gestural or other suitable data streams can optionally be combined to reduce speech recognition error rates in noisy environments.
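To make the two-stream fusion described above concrete, below is a minimal sketch (not the patented implementation) of the forward-algorithm likelihood computation for a two-chain factorial HMM: an audio chain and a video (mouth-shape) chain transition independently, and each frame's emission score combines the two streams through per-stream exponents, one common way to de-weight a degraded acoustic channel. All function names, array shapes, and the weighting scheme are illustrative assumptions rather than details taken from the patent.

```python
import numpy as np
from scipy.special import logsumexp

def av_fhmm_log_likelihood(log_b_audio, log_b_video,
                           A_audio, A_video, pi_audio, pi_video,
                           w_audio=1.0, w_video=1.0):
    """Forward log-likelihood of one utterance under a two-chain factorial HMM.

    log_b_audio : (T, Na) per-frame log P(audio feature | audio state)
    log_b_video : (T, Nv) per-frame log P(mouth feature | video state)
    A_audio     : (Na, Na) audio-chain transition matrix (rows sum to 1)
    A_video     : (Nv, Nv) video-chain transition matrix (rows sum to 1)
    pi_audio    : (Na,) initial audio-state distribution
    pi_video    : (Nv,) initial video-state distribution
    w_audio, w_video : stream exponents; lowering w_audio de-emphasizes the
                       acoustic stream, e.g. under acoustic noise.
    Assumes strictly positive transition and initial probabilities.
    """
    T, Na = log_b_audio.shape
    Nv = log_b_video.shape[1]

    # Joint emission score over the product state space (k = audio, l = video).
    log_b = w_audio * log_b_audio[:, :, None] + w_video * log_b_video[:, None, :]

    log_A_a = np.log(A_audio)   # (Na, Na), indexed [i, k]
    log_A_v = np.log(A_video)   # (Nv, Nv), indexed [j, l]

    # alpha[k, l] = log P(o_1..t, audio state k, video state l)
    alpha = np.log(pi_audio)[:, None] + np.log(pi_video)[None, :] + log_b[0]

    for t in range(1, T):
        # The chains transition independently, so sum out the previous audio
        # state first, then the previous video state.
        tmp = logsumexp(alpha[:, :, None] + log_A_a[:, None, :], axis=0)   # [j, k]
        alpha = logsumexp(tmp[:, :, None] + log_A_v[:, None, :], axis=0)   # [k, l]
        alpha += log_b[t]

    return float(logsumexp(alpha))   # log P(o_1..T) under this word model
```

In a recognizer built along these lines, one such model would typically be trained per word or phoneme and the model with the highest log-likelihood would be reported; lowering w_audio relative to w_video lets the visual stream carry more of the decision when the acoustic signal is degraded, which is the noise-robustness benefit the abstract points to.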

REFERENCES:
patent: 5454043 (1995-09-01), Freeman
patent: 5596362 (1997-01-01), Zhou
patent: 5710590 (1998-01-01), Ichige et al.
patent: 5754695 (1998-05-01), Kuo et al.
patent: 5887069 (1999-03-01), Sakou et al.
patent: 6024852 (2000-02-01), Tamura et al.
patent: 6072494 (2000-06-01), Nguyen
patent: 6075895 (2000-06-01), Qiao et al.
patent: 6108005 (2000-08-01), Starks et al.
patent: 6128003 (2000-10-01), Smith et al.
patent: 6184926 (2001-02-01), Khosravi et al.
patent: 6185529 (2001-02-01), Chen et al.
patent: 6191773 (2001-02-01), Maruno et al.
patent: 6212510 (2001-04-01), Brand
patent: 6215890 (2001-04-01), Matsuo et al.
patent: 6222465 (2001-04-01), Kumar et al.
patent: 6304674 (2001-10-01), Cass et al.
patent: 6335977 (2002-01-01), Kage
patent: 6385331 (2002-05-01), Harakawa et al.
patent: 6609093 (2003-08-01), Gopinath et al.
patent: 6624833 (2003-09-01), Kumar et al.
patent: 6678415 (2004-01-01), Popat et al.
patent: 6751354 (2004-06-01), Foote et al.
patent: 6816836 (2004-11-01), Basu et al.
patent: 6952687 (2005-10-01), Andersen et al.
patent: 6964023 (2005-11-01), Maes et al.
patent: 2002/0102010 (2002-08-01), Liu et al.
patent: 2002/0140718 (2002-10-01), Yan et al.
patent: 2002/0036617 (2002-03-01), Pryor
patent: 2002/0093666 (2002-07-01), Foote et al.
patent: 2002/0135618 (2002-09-01), Maes et al.
patent: 2002/0161582 (2002-10-01), Basson et al.
patent: 2003/0123754 (2003-07-01), Toyama
patent: 2003/0144844 (2003-07-01), Colmenarez et al.
patent: 2003/0154084 (2003-08-01), Li et al.
patent: 2003/0171932 (2003-09-01), Juang
patent: 2003/0190076 (2003-10-01), DeLean
patent: WO 00/36845 (2000-06-01), None
patent: WO 00/009218 (2003-01-01), None
Dupont et al., Audio-Visual Speech Modeling for Continuous Speech Recognition, Sep. 2000, IEEE Transactions on Multimedia, vol. 2, No. 3, pp. 141-151.
Potamianos et al., An Image Transform Approach for HMM Based Automatic Lipreading, Proc. Int. Conf. Image Processing, 1998.
Potamianos et al., IEEE Workshop on Multimedia Processing, Dec. 1998.
Chan, HMM-Based Audio-Visual Speech Recognition Integrating Geometric and Appearance-Based Visual Features, IEEE 2001.
Hennecke et al., Automatic Speech Recognition System Using Acoustic and Visual Signals, IEEE 1996.
Logan et al., “Factorial Hidden Markov Models for Speech Recognition”, Cambridge Research Laboratory, Sep. 1997, pp. 1-14.
Neti et al., “Large-vocabulary Audio-visual Speech Recognition: A Summary of the Johns Hopkins Summer 2000 Workshop,” In Proc. IEEE Workshop Multimedia Signal Processing, pp. 619-624, Cannes, France, 2001.
Dugad: Tutorial on Hidden Markov Models; Technical Report No. SPANN-96, May 1996, pp. 1-16.
Brand: Coupled Hidden Markov Models for Modeling Interacting Processes; Learning and Common Sense Technical Report 405, Jun. 3, 1997, MIT Media Lab Perceptual Computing, USA, pp. 1-28.
Nefian et al: An Embedded HMM-Based Approach for Face Detection and Recognition; Proceedings of the IEEE Int'l Conference on Acoustics, Speech and Signal Processing, Mar. 15-19, 1999, IEEE, Mar. 15, 1999, pp. 3553-3556, USA.
Nefian: Embedded Bayesian Networks for Face Recognition; IEEE Int'l Conference on Multimedia and Expo; IEEE vol. 2, Aug. 26, 2002, pp. 133-136.
Kennedy, et al: Identification of Coupled Markov Chain Model with Application; Proceedings of the 31st IEEE Conference on Decision and Control, Dec. 16-18, 1992; vol. 4, pp. 3529-3534.
Ramesh, et al: Automatic Selection of Tuning Parameters for Feature Extraction Sequences; Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Jun. 21-23, 1994, pp. 672-677.
Liang, et al: Speaker Independent Audio-Visual Continuous Speech Recognition; Aug. 2002; Multimedia and Expo, vol. 2, pp. 25-28; IEEE.
PCT/US 03/31454 Int'l Search Report dated Mar. 1, 2004.
U.S. Appl. No. 10/269,333, filed Oct. 11, 2002 Office Action dated Jan. 20, 2006.
U.S. Appl. No. 10/269,381, filed Jan. 6, 2003 Office Action dated Mar. 3, 2006.
U.S. Appl. No. 10/269,333, filed Oct. 11, 2002 Final Office Action dated May 16, 2006.
U.S. Appl. No. 10/143,459, filed May 9, 2002 Office Action dated May 23, 2006.
U.S. Appl. No. 10/269,381, filed Jan. 6, 2003 Final Office Action dated Jul. 11, 2006.
U.S. Appl. No. 10/142,468, filed May 9, 2002 Office Action dated Aug. 2, 2005.
U.S. Appl. No. 10/142,468, filed May 9, 2002 Office Action dated Mar. 1, 2006.
Int'l Application No.: PCT/US03/31454 Written Opinion dated Oct. 12, 2006.
Pending U.S. Appl. No. 10/143,459, filed May 9, 2002, inventor: Lu Hong Liang; Final Office Action dated Oct. 31, 2006.
Wikipedia: Definition of Linear Discriminant Analysis.
