Speech recognition front-end feature extraction for noisy...


Reexamination Certificate


Filing date: 2000-09-21
Issue date: 2003-10-14
Examiner: McFadden, Susan (Department: 2655)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Recognition
US classification: C704S256000
Type: Reexamination Certificate
Status: active
Patent number: 06633842
ABSTRACT:

FIELD OF INVENTION
This invention relates to speech recognition and, more particularly, to feature extraction for noisy speech.
BACKGROUND OF INVENTION
A speech recognizer operates with a suitable front-end, which typically provides a feature vector periodically, every 10-20 milliseconds. The feature vector typically comprises mel-frequency cepstral coefficients (MFCC). See, for example, S. B. Davis and P. Mermelstein, “Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences,” IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-28(4): 357-366, August 1980.
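The periodic framing described above can be sketched as follows. This is a minimal illustration, not the patent's front-end: the function names `frame_signal` and `log_energy`, the 20 ms window, the 10 ms hop, and the 8 kHz test tone are all assumptions, and per-frame log energy stands in for a full MFCC computation.

```python
import math

def frame_signal(samples, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split samples into overlapping frames, starting a new frame every hop_ms."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames

def log_energy(frame, floor=1e-10):
    """Log of the summed squared samples (a stand-in feature, not MFCC)."""
    energy = sum(x * x for x in frame)
    return math.log(max(energy, floor))

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
          for n in range(sample_rate)]

frames = frame_signal(signal, sample_rate)
features = [log_energy(f) for f in frames]  # one feature value per 10 ms step
```

A real front-end would replace `log_energy` with a mel filterbank followed by a cosine transform, producing a vector of coefficients per frame instead of a single number.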
In acoustic environments with background noise, feature vectors such as MFCC become noisy and dramatically degrade speech recognition accuracy. See, for example, applicant's article (Y. Gong) entitled “Speech Recognition in Noisy Environments: A Survey,” Speech Communication, 16 (3): 261-291, April 1995.
It is highly desirable to provide a method or apparatus that reduces the error rate when recognition is based on a noisy feature vector. Earlier work in this direction includes vector smoothing. See H. Hattori and S. Sagayama, “Vector Field Smoothing Principle for Speaker Adaptation,” International Conference on Spoken Language Processing, Vol. 1, pp. 381-384, Banff, Alberta, Canada, October 1992. Another earlier work in the same direction is statistical mapping, as described by Y. M. Cheng, D. O'Shaughnessy and P. Mermelstein in “Statistical Signal Mapping: A General Tool for Speech Processing,” in Proc. of 6th IEEE Workshop on Statistical Signal and Array Processing, pp. 436-439, 1992. Another is base transformation, described by W. C. Treurniet and Y. Gong in “Noise Independent Speech Recognition for a Variety of Noise Types,” in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 1, pp. 437-440, Adelaide, Australia, April 1994. Another is code-dependent cepstral normalization, described by A. Acero in “Acoustical and Environmental Robustness in Automatic Speech Recognition,” Kluwer Academic Publishers, 1993.
All of these prior art works require a noisy speech database to train the relationships, and they assume that the parameters of the noise encountered in the future will replicate those of the noise in the training data.
In practical situations such as speech recognition in automobiles, the noise statistics change from utterance to utterance, so the chance that a noisy speech database adequately represents the future environment is low. Therefore, the applicability of any method based on training on noisy speech data is limited.
SUMMARY OF INVENTION
In accordance with one embodiment of the present invention, a method to obtain an estimate of clean speech is provided wherein one Gaussian mixture is trained on clean speech and a second Gaussian mixture is derived from the first Gaussian mixture using some noise samples.
The method maps a noisy observation back to its clean counterpart by exploiting relationships between the noisy feature space and the clean feature space.
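One way to picture the two-mixture idea is the one-dimensional sketch below. It is an illustration under stated assumptions, not the patented procedure: the summary does not say how the second mixture is derived or how the mapping is computed. The sketch assumes log-spectral features, shifts each clean mean with a log-add rule using the average of the noise samples, keeps the variances unchanged, and returns a posterior-weighted average of the clean means. All function names and numeric values are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def derive_noisy_means(clean_means, noise_samples):
    """Derive noisy-mixture means from the clean means and a few noise samples.

    Assumption: features are log-spectral, so speech and noise add in the
    linear domain: log(exp(clean_mean) + exp(noise_mean)).
    """
    noise_mean = sum(noise_samples) / len(noise_samples)
    return [math.log(math.exp(m) + math.exp(noise_mean)) for m in clean_means]

def estimate_clean(y, weights, clean_means, noisy_means, variances):
    """Map noisy observation y to a clean estimate.

    Component posteriors are computed under the noisy mixture and then used
    to average the corresponding clean means (an assumed combination rule).
    """
    likelihoods = [w * gaussian_pdf(y, m, v)
                   for w, m, v in zip(weights, noisy_means, variances)]
    total = sum(likelihoods)
    posteriors = [lk / total for lk in likelihoods]
    return sum(p * m for p, m in zip(posteriors, clean_means))

# Hypothetical two-component clean mixture and three noise samples.
weights = [0.5, 0.5]
clean_means = [1.0, 5.0]
variances = [0.25, 0.25]
noise_samples = [2.0, 2.2, 1.8]

noisy_means = derive_noisy_means(clean_means, noise_samples)
clean_estimate = estimate_clean(2.3, weights, clean_means, noisy_means, variances)
```

Because the second mixture is derived from the clean one, each noisy component stays paired with a clean component, which is what lets the posterior computed in the noisy space select clean means. Only a handful of noise samples is needed, and no noisy speech database.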

REFERENCES:
patent: 5583961 (1996-12-01), Pawlewski et al.
patent: 5924065 (1999-07-01), Eberman et al.
patent: 6188982 (2001-02-01), Chiang
patent: 6202047 (2001-03-01), Ephraim et al.
patent: 6205424 (2001-03-01), Goldenthal et al.
patent: 6438513 (2002-08-01), Pastor et al.
patent: 6445801 (2002-09-01), Pastor et al.
patent: 6449594 (2002-09-01), Hwang et al.

Affiliated with

Inventor: Gong Yifan

Also associated with

Attorney: Brady III W. James
Examiner: McFadden Susan
Attorney: Telecky, Jr. Frederick J.
Corporate Assignee: Texas Instruments Incorporated
