Reexamination Certificate
2005-02-08
2008-11-18
McFadden, Susan (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
active
07454338
ABSTRACT:
A method and apparatus are provided that generate values for a first set of dimensions of a feature vector from a speech signal. The values of the first set of dimensions are used to estimate values for a second set of dimensions of the feature vector to form an extended feature vector. The extended feature vector is then used to train an acoustic model.
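The abstract does not say how the second set of dimensions is estimated. As an illustrative sketch only, the Python below assumes a single joint Gaussian over both sets of dimensions with known means and covariances, and fills in the second set with its conditional mean given the first. This estimator, the function names (`solve`, `extend_feature_vector`), and all numeric values are assumptions made for illustration and are not taken from the patent.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def extend_feature_vector(
    observed: Sequence[float],   # first set of dimensions, computed from speech
    mean_obs: Vector,            # assumed mean of the first set
    mean_missing: Vector,        # assumed mean of the second set
    cov_obs: Matrix,             # assumed covariance of the first set
    cov_missing_obs: Matrix,     # assumed cross-covariance (second x first)
) -> Vector:
    """Append the conditional-mean estimate of the second set to the first."""
    centered = [x - m for x, m in zip(observed, mean_obs)]
    weights = solve(cov_obs, centered)  # cov_obs^{-1} (x - mean_obs)
    estimated = [
        m + sum(c * w for c, w in zip(row, weights))
        for m, row in zip(mean_missing, cov_missing_obs)
    ]
    return list(observed) + estimated


if __name__ == "__main__":
    # Toy numbers: two observed dimensions, one estimated dimension.
    extended = extend_feature_vector(
        observed=[1.0, 2.0],
        mean_obs=[0.0, 0.0],
        mean_missing=[5.0],
        cov_obs=[[2.0, 0.0], [0.0, 4.0]],
        cov_missing_obs=[[1.0, 2.0]],
    )
    print(extended)
```

In the setting the abstract describes, the extended vectors produced this way would then serve as training data for the acoustic model. The single-Gaussian assumption is used here only to keep the example short.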
REFERENCES:
patent: 6917918 (2005-07-01), Rockenbeck et al.
patent: 7016838 (2006-03-01), Rockenbeck et al.
N. Morgan et al., “Meetings about meetings: research at ICSI on Speech in Multiparty Conversations,” in Proc. ICASSP, Hong Kong, Apr. 2003, vol. 4, pp. 740-743.
J.S. Garofolo et al., “The Rich Transcription 2004 Spring Meeting Recognition Evaluation,” in Proc. NIST RT04 Meeting Recognition Workshop, Montreal, Canada 2004.
P. Moreno et al., “Sources of Degradation of Speech Recognition in the Telephone Network,” in Proc. ICASSP, Adelaide, Australia, Apr. 1994, vol. I, pp. 109-112.
Z. Ghahramani et al., “Supervised Learning From Incomplete Data via an EM Approach,” in Advances in Neural Information Processing Systems, 1994.
B. Raj et al., “Reconstruction of Damaged Spectrographic Features for Robust Speech Recognition,” in Proc. ICSLP, Beijing, China, Oct. 2000.
M. Cooke et al., “Robust Automatic Speech Recognition With Missing And Unreliable Acoustic Data,” Speech Communication, vol. 34, No. 3, pp. 267-285, Jun. 2001.
M. L. Seltzer et al., “Classifier-Based Mask Estimation For Missing Feature Methods of Robust Speech Recognition,” in Proc. ICSLP, Beijing, China 2000.
L. G. Neumeyer et al., “Training Issues and Channel Equalization Techniques for the Construction of Telephone Acoustic Models Using a High-Quality Speech Corpus,” IEEE Trans. Speech Audio Processing, vol. 2, No. 4, pp. 590-597, Oct. 1994.
Y.M. Cheng et al., “Statistical Recovery of Wideband Speech from Narrowband Speech,” IEEE Trans. Speech Audio Processing, vol. 2, No. 4, pp. 544-548, Oct. 1994.
K.-Y. Park et al., “Narrowband to Wideband Conversion of Speech Using GMM Based Transformation,” in Proc. ICASSP, Istanbul, Turkey, Jun. 2000, vol. 3, pp. 1843-1846.
P. Jax et al., “Wideband extension of telephone speech using a hidden Markov model,” in IEEE Workshop on Speech Coding, Delavan, Wisconsin, Sep. 2000, pp. 133-135.
L.R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications In Speech Recognition,” Proc. IEEE, vol. 77, No. 2, pp. 257-286, Feb. 1989.
J.-L. Gauvain et al., “Maximum a Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains,” IEEE Trans. Speech Audio Processing, vol. 2, No. 2, pp. 291-298, Apr. 1994.
K.-F. Lee et al., “Speaker-independent phone recognition using Hidden Markov Models,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, No. 11, pp. 1641-1648, Nov. 1989.
S. Young, “The HTK Hidden Markov Model Toolkit: Design and Philosophy,” Tech. Rep., Cambridge University, 1994.
B.J. Frey et al., “ALGONQUIN: Iterating Laplace's Method to Remove Multiple Types of Acoustic Distortion for Robust Speech Recognition,” in Proc. Eurospeech, Aalborg, Denmark, Sep. 2001.
J.A. Bilmes, “A Gentle Tutorial of the EM Algorithm and Its Applications to Parameter Estimation for Gaussian Mixture and Hidden Markov Models,” Tech. Rep. TR-97-021, U.C. Berkeley, Berkeley, CA, Apr. 1998.
N. Enbom et al., “Bandwidth Expansion of Speech Based on Vector Quantization of the Mel Frequency Cepstral Coefficients,” IEEE Workshop on Speech Coding, 1999.
S. Chennoukh et al., “Speech Enhancement Via Frequency Bandwidth Extension Using Line Spectral Frequencies,” ICASSP, 2001.
P. Jax et al., “Artificial Bandwidth Extension of Speech Signals Using MMSE Estimation Based on a Hidden Markov Model,” ICASSP, 2003.
Y. Qian et al., “Combining Equalization and Estimation for Bandwidth Extension of Narrowband Speech,” ICASSP, 2004.
P.J. Moreno et al., “A Vector Taylor Series Approach to Environment-Independent Speech Recognition,” ICASSP, 1996.
J. Droppo et al., “A Comparison of Three Non-Linear Observation Models for Noisy Speech Features,” Eurospeech, 2003.
A.P. Dempster et al., “Maximum Likelihood from Incomplete Data Via the EM Algorithm,” Journal of the Royal Statistical Society, Series B, vol. 39, No. 1, pp. 1-38, 1977.
Acero, Alejandro
Seltzer, Michael L.
Magee, Theodore M.
McFadden, Susan
Microsoft Corporation
Westman Champlin & Kelly P.A.
Training wideband acoustic models in the cepstral domain...