Reexamination Certificate
2002-12-19
2008-12-30
Smits, Talivaldis Ivars (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S236000, C382S227000
active
07472063
ABSTRACT:
A speech recognition method includes several embodiments describing application of support vector machine analysis to a mouth region. Lip position can be accurately determined and used in conjunction with synchronous or asynchronous audio data to enhance speech recognition probabilities.
REFERENCES:
patent: 5454043 (1995-09-01), Freeman
patent: 5596362 (1997-01-01), Zhou
patent: 5710590 (1998-01-01), Ichige et al.
patent: 5754695 (1998-05-01), Kuo et al.
patent: 5850470 (1998-12-01), Kung et al.
patent: 5887069 (1999-03-01), Sakou et al.
patent: 6024852 (2000-02-01), Tamura et al.
patent: 6072494 (2000-06-01), Nguyen
patent: 6075895 (2000-06-01), Qiao et al.
patent: 6108005 (2000-08-01), Starks et al.
patent: 6128003 (2000-10-01), Smith et al.
patent: 6184926 (2001-02-01), Khosravi et al.
patent: 6185529 (2001-02-01), Chen et al.
patent: 6191773 (2001-02-01), Maruno et al.
patent: 6212510 (2001-04-01), Brand
patent: 6215890 (2001-04-01), Matsuo et al.
patent: 6219639 (2001-04-01), Bakis et al.
patent: 6222465 (2001-04-01), Kumar et al.
patent: 6304674 (2001-10-01), Cass et al.
patent: 6335977 (2002-01-01), Kage
patent: 6385331 (2002-05-01), Harakawa et al.
patent: 6594629 (2003-07-01), Basu et al.
patent: 6609093 (2003-08-01), Gopinath et al.
patent: 6624833 (2003-09-01), Kumar et al.
patent: 6633844 (2003-10-01), Verma et al.
patent: 6678415 (2004-01-01), Popat et al.
patent: 6751354 (2004-06-01), Foote et al.
patent: 6816836 (2004-11-01), Basu et al.
patent: 6952687 (2005-10-01), Andersen et al.
patent: 6996549 (2006-02-01), Zhang et al.
patent: 2002/0031262 (2002-03-01), Imagawa et al.
patent: 2002/0036617 (2002-03-01), Pryor
patent: 2002/0093666 (2002-07-01), Foote et al.
patent: 2002/0102010 (2002-08-01), Liu et al.
patent: 2002/0135618 (2002-09-01), Maes et al.
patent: 2002/0140718 (2002-10-01), Yan et al.
patent: 2002/0161582 (2002-10-01), Basson et al.
patent: 2003/0110038 (2003-06-01), Sharma et al.
patent: 2003/0123754 (2003-07-01), Toyama
patent: 2003/0144844 (2003-07-01), Colmenarez et al.
patent: 2003/0154084 (2003-08-01), Li et al.
patent: 2003/0171932 (2003-09-01), Juang et al.
patent: 2003/0190076 (2003-10-01), DeLean
patent: 2003/0212552 (2003-11-01), Liang
patent: 2112273 (1995-08-01), None
patent: 2093890 (1997-10-01), None
patent: WO 00/36845 (2000-06-01), None
patent: WO 03/009218 (2003-01-01), None
Juergen Luettin, Gerasimos Potamianos, and Chalapathy Neti, “Asynchronous Stream Modelling for Large Vocabulary Audio-Visual Speech Recognition,” Proc. 2001 IEEE Int. Conf. on Acoust., Speech, and Sig. Proc. (ICASSP'01), May 7-11, 2001, pp. 169-172.
Mihaela Gordan, Constantine Kotropoulos, and Ioannis Pitas, “A Temporal Network of Support Vector Machine Classifiers for the Recognition of Visual Speech,” Meth. and Appl. of Artificial Intelligence: Proc. 2nd Hellenic Conf. on AI (SETN 2002), Thessaloniki, Greece, Apr. 11-12, 2002, pp. 355-365.
Ming-Hsuan Yang, David J. Kriegman, and Narendra Ahuja, “Detecting Faces in Images: A Survey,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, No. 1, Jan. 2002, pp. 34-58.
Yongmin Li, Shaogang Gong, Jamie Sherrah, and Heather Liddell, “Multi-view Face Detection Using Support Vector Machines and Eigenspace Modelling,” Proc. Int. Conf. on Knowledge-based Intelligent Engineering Systems and Allied Technologies, Brighton UK, Sep. 2000, pp. 241-244.
Pankaj Batra, “Modeling and Efficient Optimization for Object-Based Scalability and Some Related Problems,” IEEE Trans. Image Proc., vol. 9, No. 10, Oct. 2000, pp. 1677-1692.
Hennecke, et al: Automatic Speech Recognition System Using Acoustic and Visual Signals, IEEE, 1996.
Dupont et al: Audio-Visual Speech Modeling for Continuous Speech Recognition, Sep. 2000, IEEE Transactions on Multimedia, vol. 2, No. 3, pp. 141-151.
Potamianos et al: An Image Transform Approach for HMM Based Automatic Lipreading, Proc. Int. Conf. Image Processing, 1998.
Potamianos et al: Linear Discriminant Analysis for Speechreading; IEEE Workshop on Multimedia Processing, Dec. 1998.
Chan. HMM-Based Audio-Visual Speech Recognition Integrating Geometric and Appearance-Based Visual Features, IEEE 2001.
Pavlovic: Dynamic Bayesian Networks for Information Fusion with Applications to Human-Computer Interfaces; Thesis, University of Urbana-Champaign, 1999, pp. iii-ix and 63-81.
Rezek, et al: Coupled Hidden Markov Models for Biosignal Interaction; Advances in Medical Signal and Information Processing, Sep. 4-6, 2000; pp. 54-59.
Fu, et al: Audio-Visual Speaker Identification Using Coupled Hidden Markov Models; 2003 Int'l Conference on Image Processing (ICIP), Sep. 14-17, 2003; vol. 2, pp. 29-32.
Nefian, et al: A Coupled HMM for Audio-Visual Speech Recognition; Proceedings IEEE Int'l Conference on Acoustics, Speech, and Signal Processing, vol. 3 of 4, May 13-17, 2002, pp. 2013-2016.
Kristjansson, et al: Event-Coupled Hidden Markov Models; 2000 IEEE Int'l Conference on Multimedia and Expo, Jul. 30-Aug. 2, 2000; vol. 1; pp. 385-388.
Pavlovic: Multimodal Tracking and Classification of Audio-Visual Features; 1998 Int'l Conference on Image Processing, ICIP Proceedings; Oct. 4-7, 1998, vol. 1; pp. 343-347.
Wikipedia, definition of Hidden Markov Model, 3 pages.
Wikipedia, definition of Viterbi Algorithm, 5 pages.
Rezek, et al: Learning Interaction Dynamics with Coupled Hidden Markov Models; IEEE Proceedings—Science, Measurement and Technology, Nov. 2000; vol. 147, Issue 6; pp. 345-350.
Logan et al: Factorial Hidden Markov Models for Speech Recognition: Preliminary Experiments; Cambridge Research Laboratory; Technical report Series; CRL 97/7; Sep. 1997.
Dugad: Tutorial on Hidden Markov Models; Technical Report No.: SPANN-96, May 1996, pp. 1-16.
Brand: Coupled Hidden Markov Models for Modeling Interacting Processes; Learning and Common Sense Technical Report 405, Jun. 3, 1997, MIT Media Lab Perceptual Computing, USA, pp. 1-28.
Nefian et al: An Embedded HMM-Based Approach for Face Detection and Recognition; Proceedings of the IEEE Int'l Conference on Acoustics, Speech and Signal Processing, Mar. 15-19, 1999; IEEE, Mar. 15, 1999, pp. 3553-3556, USA.
Nefian: Embedded Bayesian Networks for Face Recognition; IEEE Int'l Conference on Multimedia and Expo; IEEE vol. 2, Aug. 26, 2002, pp. 133-136.
Kennedy, et al: Identification of Coupled Markov Chain Model with Application; Proceedings of the 31st IEEE Conference on Decision and Control, Dec. 16-18, 1992; vol. 4, pp. 3529-3534.
Ramesh, et al: Automatic Selection of Tuning Parameters for Feature Extraction Sequences; Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Jun. 21-23, 1994, pp. 672-677.
Liang, et al: Speaker Independent Audio-Visual Continuous Speech Recognition; Aug. 2002; Multimedia and Expo, vol. 2, pp. 25-28; IEEE.
Int'l Application No.: PCT/US03/31454 Written Opinion dated Oct. 12, 2006.
Pending U.S. Appl. No. 10/143,459, filed May 9, 2002, inventor: Liang Office Action dated Oct. 31, 2006.
Wikipedia definition: Linear Discriminant Analysis.
Neti et al.: Large-Vocabulary Audio-Visual Speech Recognition: A Summary of the Johns Hopkins Summer 2000 Workshop; IEEE 2001, pp. 619-624.
Wikipedia, definition of Viterbi Algorithm, 5 pages, Feb. 8, 2006.
Wikipedia, definition of Hidden Markov Model, 3 pages, Feb. 11, 2006.
U.S. Appl. No. 10/142,468 filed May 9, 2002 Office Action dated Mar. 1, 2006.
U.S. Appl. No. 10/142,468 filed May 9, 2002 Office Action dated Aug. 2, 2005.
U.S. Appl. No. 10/269,333 filed Oct. 11, 2002 Office Action dated Jan. 20, 2006.
U.S. Appl. No. 10/269,381 filed Jan. 6, 2003 Office Action dated Mar. 3, 2006.
PCT/US 03/31454 Int'l Search Report dated Mar. 1, 2004.
Wikipedia: Definition of Linear Discriminant Analysis, Aug. 29, 2006.
U.S. Appl. No. 10/143,459 filed May 9, 2002 Office Action dated May 23, 2006.
U.S. Appl. No. 10/269,333 filed Oct. 11, 2002 Final Office Action dated May 16, 2006.
U.S. Appl. No. 10/142,447 filed May 9, 2002 Office Action dated May 17, 2006.
U.S. Patent and Trademark Office Official Action in related U.
Liang Luhong
Liu Xiaoxing
Nefian Ara V.
Pi Xiaobo
Zhao Yibao
Intel Corporation
Smits Talivaldis Ivars
Trop Pruner & Hu P.C.
Profile ID: LFUS-PAI-O-4027478