Reexamination Certificate
1999-04-15
2002-05-14
Korzuch, William (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
active
06389393
ABSTRACT:
FIELD OF THE INVENTION
This invention relates to modeling speech for speech recognition and, more particularly, to providing speech recognition in a noisy environment.
BACKGROUND OF THE INVENTION
Automatic speech recognizers exhibit rapid degradation in performance when there is a mismatch between training and testing conditions. This mismatch can be caused by speaker variability, additive acoustic environmental noise and convolutive distortions due to the use of different telephone channels. All these variabilities are also present in an automobile environment and this degrades the performance of speech recognizers when used in an automobile.
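The two acoustic sources of mismatch named above are commonly written as a simple signal model. The following is a standard textbook formulation rather than an equation taken from the patent, and the symbols x (clean speech), h (channel impulse response), n (additive noise) and y (observed signal) are assumptions introduced here for illustration:

```latex
% Time domain: convolutive channel distortion plus additive noise
y[t] = (x * h)[t] + n[t]

% Power-spectral domain, assuming speech and noise are uncorrelated
|Y(f)|^2 \approx |X(f)|^2 \, |H(f)|^2 + |N(f)|^2
```

Under this model the channel acts multiplicatively on the spectrum while the noise acts additively, which is why the two kinds of distortion are usually compensated by different techniques.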
Several techniques have been proposed to improve the robustness of speech recognizers under mismatch conditions (Y. Gong, “Speech Recognition in Noisy Environments: A Survey,” Speech Communication, 16(3): 261-291, April 1995). These techniques fall under the following two main categories:
feature pre-processing techniques, such as spectral subtraction and cepstral mean normalization (CMN), which aim at modifying the corrupted features so that the resulting features are closer to those of clean speech.
model adaptation techniques, such as maximum likelihood linear regression (MLLR) (C. J. Leggetter and P. C. Woodland, “Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density HMMs,” Computer, Speech and Language, 9(2): 171-185, 1995), maximum a posteriori (MAP) estimation (J. L. Gauvain and C. H. Lee, “Maximum A Posteriori Estimation for Multivariate Gaussian Observations of Markov Chains,” IEEE Trans. on Speech and Audio Processing, 2(2): 291-298, April 1994), and parallel model combination (PMC) (M. J. F. Gales, “‘NICE’ Model-Based Compensation Schemes for Robust Speech Recognition,” Proc. ESCA-NATO Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, pages 55-64, April 1997), in which model parameters of the corrupted speech model are estimated to account for the mismatch.
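As an illustration of the first category, the two feature pre-processing techniques named above can be sketched in a few lines of Python. This is a minimal sketch in their usual textbook form, not code from the patent or the cited papers; the function names, the list-of-lists frame layout and the `floor` parameter are assumptions made for this example.

```python
def cepstral_mean_normalization(frames):
    """Subtract the per-utterance mean from every cepstral frame.

    `frames` is a list of equal-length lists, one cepstral vector per frame
    (an assumed layout for this sketch).
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate from each spectral bin.

    Bins that would fall below `floor` times the noisy power are clamped
    to that value so the result never becomes negative.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]


if __name__ == "__main__":
    clean = [[1.0, 2.0], [3.0, 6.0], [2.0, 4.0]]
    # A fixed channel appears as a constant offset in the cepstral domain.
    channel_offset = [0.5, -1.0]
    distorted = [[c + h for c, h in zip(f, channel_offset)] for f in clean]
    print(cepstral_mean_normalization(clean))
    print(cepstral_mean_normalization(distorted))
```

Because a fixed channel contributes the same offset to every cepstral frame, subtracting the utterance mean yields the same normalized features for the clean and the distorted input, up to floating-point rounding.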
SUMMARY OF THE INVENTION
In accordance with one embodiment of the present invention, a two-stage model adaptation scheme is provided. The first stage adapts a speaker-independent HMM (Hidden Markov Model) seed model set to a speaker- and microphone-dependent model set; the second stage adapts that set to a speaker- and noise-dependent model set.
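The data flow of such a two-stage scheme can be sketched as follows. The summary does not state which adaptation method each stage uses, so the choices here are assumptions for illustration only: stage one applies a per-dimension affine transform to the Gaussian means (a diagonal simplification in the spirit of MLLR), and stage two combines each mean with a noise mean using the log-add approximation associated with PMC, treating the means as log-spectral values. All function and parameter names are hypothetical.

```python
import math


def adapt_to_speaker_and_microphone(seed_means, scale, bias):
    """Stage 1 (assumed form): per-dimension affine transform of each mean.

    Maps speaker-independent seed means to speaker- and
    microphone-dependent means.
    """
    return [
        [s * m + b for m, s, b in zip(mean, scale, bias)]
        for mean in seed_means
    ]


def adapt_to_noise(speaker_means, noise_mean):
    """Stage 2 (assumed form): log-add combination with a noise mean.

    Treats every value as a log-spectral energy, so speech and noise
    add in the linear domain: log(exp(speech) + exp(noise)).
    """
    return [
        [math.log(math.exp(m) + math.exp(n)) for m, n in zip(mean, noise_mean)]
        for mean in speaker_means
    ]


if __name__ == "__main__":
    seed = [[0.0, 1.0], [2.0, 3.0]]  # toy seed means, one list per Gaussian
    speaker_dependent = adapt_to_speaker_and_microphone(
        seed, scale=[1.0, 2.0], bias=[0.5, 0.0]
    )
    noise_dependent = adapt_to_noise(speaker_dependent, noise_mean=[1.0, 1.0])
    print(speaker_dependent)
    print(noise_dependent)
```

Keeping the stages as separate functions mirrors the structure described above: the speaker- and microphone-dependent set is computed once, and only the second stage needs to be repeated when the noise estimate changes.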
REFERENCES:
patent: 5353376 (1994-10-01), Oh et al.
patent: 5950157 (1999-09-01), Heck et al.
Angelini, B., F. Brugnara, D. Falavigna, D. Giuliani, R. Gretter, and M. Omologo, “Speaker Independent Continuous Speech Recognition using an Acoustic-Phonetic Italian Corpus,” Proc. ICSLP, Yokohama, vol. 3, pp. 1391-1394, Sep. 1994.*
Gales, M. J. F., D. Pye, and P. C. Woodland, “Variance Compensation within the MLLR Framework for Robust Speech Recognition and Speaker Adaptation,” Proc. Fourth Int. Conf. on Spoken Language ICSLP 1996, vol. 3, pp. 1832-1835, Oct. 3-6, 1996.*
Giuliani, D., M. Matassoni, M. Omologo, and P. Svaizer, “Experiments of HMM Adaptation for Hands-Free Connected Digit Recognition,” Proc. 1998 IEEE Int. Conf. on Acoust., Speech and Sig. Proc., vol. 1, pp. 474-476, May 12-15, 1998.*
Giuliani, D., M. Matassoni, M. Omologo, and P. Svaizer, “Training of HMM with Filtered Speech Material for Hands-Free Recognition,” Proc. 1999 IEEE Int. Conf. on Acoust., Speech and Sig. Proc., vol. 1, pp. 449-452, Mar. 15-19, 1999.*
Giuliani, D., M. Omologo, and P. Svaizer, “Experiments of Speech Recognition in a Noisy and Reverberant Environment using a Microphone Array and HMM Adaptation,” Proc. Fourth Int. Conf. on Spoken Language ICSLP 1996, vol. 3, pp. 1329-1332, Oct. 3-6, 1996.*
Leggetter, C. J. and P. C. Woodland, “Speaker Adaptation of Continuous Density HMMs using Multivariate Linear Regression,” Proc. ICSLP 94, vol. 2, pp. 451-454, Sep. 1994.*
Omologo, Maurizio, Marco Matassoni, Piergiorgio Svaizer, and Diego Giuliani, “Microphone Array Based Speech Recognition With Different Talker-Array Positions,” ICASSP-97 1997 IEEE Int. Conf. Acoust., Speech, Sig. Proc., vol. 1, pp. 227-230, Apr. 21-24, 1997.*
Woodland, P. C., D. Pye, and M. J. F. Gales, “Iterative Unsupervised Adaptation Using Maximum Likelihood Linear Regression,” Proc. Fourth Int. Conf. on Spoken Language ICSLP 96, vol. 2, pp. 1133-1136, 1996.
Storm Donald L.
Telecky, Jr. Frederick J.
Texas Instruments Incorporated
Troike Robert L.