Confidence measures using sub-word-dependent weighting of...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S252000, C704S255000

active

06539353

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to speech recognition. In particular, the present invention relates to confidence measures in speech recognition.
In speech recognition systems, an input speech signal is converted into words that represent the verbal content of the speech signal. This conversion is complicated by many factors, including differences between speakers, inconsistent pronunciation by a single speaker, and the inherent complexity of languages. Because of these complexities, speech recognition systems have been unable to recognize speech with one hundred percent accuracy.
In acknowledgement of this limited accuracy, many speech recognition systems include confidence measure modules that determine the likelihood that the speech recognition system has properly identified a particular word. For example, if a speech recognition system identifies a word as “PARK”, the confidence measure indicates how likely it is that the word is actually “PARK” and not some similar word such as “PART” or “DARK”.
Such confidence measures typically make decisions at the word level using word-level or sub-word-level features. Because word-level features are usually task-dependent, it is difficult to use them outside of the speech recognition task they were designed for. Sub-word-level features, on the other hand, are more general and can be used for a variety of speech recognition tasks. Traditionally, sub-word features are used to generate sub-word confidence measures, which are averaged to derive an overall confidence measure for a word. For reasons discussed further below, such averaging is less than ideal for confidence measures. As such, an improved confidence measure is desired.
SUMMARY OF THE INVENTION
A method and apparatus are provided for speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract at least one feature from the digital signal. A hypothesis word string that consists of sub-word units is identified from the extracted feature. For each identified word, a word confidence measure is determined based on weighted confidence measure scores for each sub-word unit in the word. The weighted confidence measure scores are created by applying different weights to confidence scores associated with different sub-words of the hypothesis word.
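The difference between plain averaging and sub-word-dependent weighting can be illustrated with a minimal Python sketch. The summary does not give the exact combination formula, so the normalized weighted average below is an assumption, and the function names, phone labels, scores, and weight values are all hypothetical.

```python
from typing import Dict, List, Tuple

# A sub-word unit paired with its confidence score, e.g. ("k", 0.4).
ScoredUnit = Tuple[str, float]


def average_confidence(units: List[ScoredUnit]) -> float:
    """Traditional approach: every sub-word score counts equally."""
    if not units:
        raise ValueError("a word needs at least one sub-word unit")
    return sum(score for _, score in units) / len(units)


def weighted_word_confidence(
    units: List[ScoredUnit],
    weights: Dict[str, float],
    default_weight: float = 1.0,
) -> float:
    """Assumed form: normalized weighted average, one weight per sub-word type."""
    if not units:
        raise ValueError("a word needs at least one sub-word unit")
    weighted_sum = 0.0
    weight_total = 0.0
    for unit, score in units:
        # Units without a stored weight fall back to default_weight.
        w = weights.get(unit, default_weight)
        weighted_sum += w * score
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("weights must not sum to zero")
    return weighted_sum / weight_total


# Hypothetical sub-word scores for the hypothesis word "PARK".
park = [("p", 0.9), ("aa", 0.8), ("r", 0.7), ("k", 0.2)]
# Hypothetical weights that trust the final consonant more than the vowel.
weights = {"p": 1.0, "aa": 0.5, "r": 1.0, "k": 2.5}

print(average_confidence(park))
print(weighted_word_confidence(park, weights))
```

With these made-up numbers the weighted score falls below the plain average, because the low-scoring final unit, which is what separates "PARK" from "PART", carries the largest weight.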
In another aspect of the invention, the weights of the weighted confidence measure scores are determined using training data including speech waveform data and their transcriptions.
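The summary does not say how the weights are estimated from the training data. As one possible illustration only, the sketch below fits one weight per sub-word type with a logistic-regression-style gradient update, assuming each training word has already been labeled correct (1) or incorrect (0) by comparing the recognizer output with the transcription. The model form, function names, learning rate, and toy data are all assumptions rather than the patented procedure.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

ScoredUnit = Tuple[str, float]          # (sub-word label, confidence score)
Example = Tuple[List[ScoredUnit], int]  # (units of one word, 1 = correct, 0 = wrong)


def word_score(units: List[ScoredUnit], weights: Dict[str, float], bias: float) -> float:
    """Sigmoid of the bias plus the mean weighted sub-word score (assumed model)."""
    z = bias + sum(weights.get(u, 1.0) * s for u, s in units) / len(units)
    return 1.0 / (1.0 + math.exp(-z))


def train_weights(
    examples: List[Example], epochs: int = 200, lr: float = 0.5
) -> Tuple[Dict[str, float], float]:
    """Stochastic gradient descent on the cross-entropy loss."""
    weights: Dict[str, float] = defaultdict(lambda: 1.0)  # start from equal weights
    bias = 0.0
    for _ in range(epochs):
        for units, label in examples:
            error = word_score(units, weights, bias) - label
            n = len(units)
            for unit, score in units:
                # Gradient of the loss with respect to this unit's weight.
                weights[unit] -= lr * error * score / n
            bias -= lr * error
    return dict(weights), bias


# Toy data: the score of "k" separates correct from incorrect words,
# while the score of "p" is the same in both cases.
examples = [
    ([("p", 0.8), ("k", 0.9)], 1),
    ([("p", 0.8), ("k", -0.9)], 0),
]
weights, bias = train_weights(examples)
print(weights, bias)
```

On this toy data the weight for "k" grows during training, since its score is the only feature that distinguishes the correctly recognized word from the misrecognized one.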


