Reexamination Certificate
2000-09-21
2003-10-14
McFadden, Susan (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S256000
active
06633842
ABSTRACT:
FIELD OF INVENTION
This invention relates to speech recognition and, more particularly, to feature extraction for noisy speech.
BACKGROUND OF INVENTION
A speech recognizer operates with a suitable front-end, which typically provides a feature vector periodically, every 10-20 milliseconds. The feature vector typically comprises mel-frequency cepstral coefficients (MFCC). See, for example, S. B. Davis and P. Mermelstein, “Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences,” IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-28(4): 357-366, August 1980.
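As a rough illustration of the periodic front-end described above, the following sketch splits a sampled signal into overlapping analysis frames, one per hop. The 10 ms hop falls within the 10-20 millisecond range stated in the text; the 25 ms frame length, the 8 kHz sample rate, and the name frame_signal are assumptions made for this example only. Computing MFCC from each frame is not shown.

```python
def frame_signal(samples, sample_rate_hz, frame_ms=25.0, hop_ms=10.0):
    """Split samples into overlapping frames; one feature vector would be
    computed per frame. frame_ms is an assumed value, not from the source."""
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    hop_len = int(round(sample_rate_hz * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop lengths must be positive")
    frames = []
    start = 0
    # Only full frames are kept; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames


# One second of silence at an assumed 8 kHz sample rate.
one_second = [0.0] * 8000
frames = frame_signal(one_second, sample_rate_hz=8000)
print(len(frames), "frames of", len(frames[0]), "samples")
```

With a 10 ms hop the front-end emits roughly 100 feature vectors per second of speech, which is the rate at which the recognizer consumes them.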
In acoustic environments with a noisy background, feature vectors such as MFCC become noisy and dramatically degrade speech recognition performance. See, for example, applicant's article (Y. Gong) entitled “Speech Recognition in Noisy Environments: A Survey,” Speech Communication, 16 (3): 261-291, April 1995.
It is highly desirable to provide a method or apparatus that reduces the error rate when recognition is based on a noisy feature vector. Earlier work in this direction includes vector smoothing. See H. Hattori and S. Sagayama, “Vector Field Smoothing Principle for Speaker Adaptation,” International Conference on Spoken Language Processing, Vol. 1, pp. 381-384, Banff, Alberta, Canada, October 1992. Another earlier work in the same direction is statistical mapping, as described by Y. M. Cheng, D. O'Shaughnessy and P. Mermelstein in “Statistical Signal Mapping: A General Tool for Speech Processing,” in Proc. of 6th IEEE Workshop on Statistical Signal and Array Processing, pp. 436-439, 1992. Another work in the same direction is base transformation, described by W. C. Treurniet and Y. Gong in “Noise Independent Speech Recognition for a Variety of Noise Types,” in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 1, pp. 437-440, Adelaide, Australia, April 1994. Another work in the same direction is code-dependent cepstral normalization, described by A. Acero in “Acoustical and Environmental Robustness in Automatic Speech Recognition,” Kluwer Academic Publishers, 1993.
All of these prior art works require a noisy speech database to train the relationships, and they assume that the parameters of the noise encountered in the future will replicate those of the noise in the training data.
In practical situations such as speech recognition in automobiles, the noise statistics change from utterance to utterance, so the chance that a noisy speech database adequately represents the future environment is low. Therefore, the applicability of any method based on training on noisy speech data is limited.
SUMMARY OF INVENTION
In accordance with one embodiment of the present invention, a method to obtain an estimate of clean speech is provided wherein one Gaussian mixture is trained on clean speech and a second Gaussian mixture is derived from the first Gaussian mixture using some noise samples.
The present method maps a noisy observation back to its clean counterpart by exploiting relationships between the noisy feature space and the clean feature space.
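The two-mixture idea in this summary can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method: features are treated as log-spectral energies, the second mixture's means are derived from the clean means with a log-add approximation using the mean of a few noise samples, the clean variances are reused unchanged, and the clean estimate is a posterior-weighted correction of the noisy observation. All function names are hypothetical.

```python
import math


def log_add(a, b):
    """log(exp(a) + exp(b)), computed stably."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def derive_noisy_means(clean_means, noise_samples):
    """Assumed derivation: combine each clean mean with the noise mean
    in the log-spectral domain (log-add approximation)."""
    dim = len(clean_means[0])
    n = len(noise_samples)
    noise_mean = [sum(s[d] for s in noise_samples) / n for d in range(dim)]
    return [[log_add(mu[d], noise_mean[d]) for d in range(dim)]
            for mu in clean_means]


def log_gauss_diag(y, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
               for yi, m, v in zip(y, mean, var))


def posteriors(y, weights, means, variances):
    """Component posteriors p(k | y) under the given mixture."""
    logs = [math.log(w) + log_gauss_diag(y, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)
    exps = [math.exp(l - peak) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_clean(y, weights, clean_means, noisy_means, variances):
    """Subtract the posterior-weighted shift between the noisy and clean
    component means from the noisy observation y."""
    post = posteriors(y, weights, noisy_means, variances)
    return [y[d] - sum(p * (nm[d] - cm[d])
                       for p, nm, cm in zip(post, noisy_means, clean_means))
            for d in range(len(y))]


# Toy one-dimensional example with two mixture components (made-up values).
weights = [0.5, 0.5]
clean_means = [[0.0], [4.0]]
variances = [[0.01], [0.01]]
noise_samples = [[1.0], [1.0]]
noisy_means = derive_noisy_means(clean_means, noise_samples)
print(estimate_clean(noisy_means[1], weights, clean_means, noisy_means, variances))
```

Only the clean-speech mixture needs training data here; the noisy mixture is recomputed from whatever noise samples are available, which is why no noisy speech database appears in the sketch.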
Brady III W. James
McFadden Susan
Telecky, Jr. Frederick J.
Texas Instruments Incorporated