Reexamination Certificate
2000-07-25
2003-01-28
Chawan, Vijay B. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S231000, C704S251000
active
06513005
ABSTRACT:
FIELD OF THE INVENTION
The invention relates to speech recognition technology and, more particularly, to a method for correcting error characters in results of speech recognition and a speech recognition system using the same.
BACKGROUND OF THE INVENTION
Speech recognition technology uses computers and digital signal processing to accurately recognize human speech (for example, characters, words, sub-sentences, sentences, etc.). Speech recognition works by extracting the relevant characteristics of the speech to be recognized, forming an acoustic model from them, comparing that model with the sample models stored in the computer, and identifying the characters and words through pattern classification methods. The recognition process operates on language units such as syllables or words. Speech recognition is undoubtedly a fast and effective way to input text into a computer. Although a great deal of research on speech recognition has been performed, recognition of continuous speech, speaker-independent speech, and long words is still at an exploratory stage because of the complexity of language. Because the accuracy of speech recognition can never reach 100%, error correction of the recognition results is an indispensable step.
The friendliness and efficiency of the alternative input methods used in the error correction process are very important, since error correction is part of the complete speech input process and can be a deciding factor in the user's acceptance of speech input. Generally, different input methods, such as handwriting input or various types of stroke-based input, have been used to correct the error characters in speech recognition results. Users of a speech recognition system often do not want to use a keyboard or are not familiar with one, and they prefer handwriting input, stroke-based input, or stroke type-based input, which are close to natural handwriting habits. However, handwriting recognition is not a mature technology, so the error correction efficiency for speech recognition results is reduced. The error correction methods used so far for speech recognition results do not take advantage of the useful acoustic information generated by the speech recognition process.
SUMMARY OF THE INVENTION
An object of the invention is to make effective use of the acoustic information generated by the speech recognition process, so as to improve the error correction efficiency of speech recognition, that is, to improve the reliability and speed of the error correction.
The invention fully exploits the useful acoustic information obtained in the speech recognition process to maximize the error correction efficiency for the results associated with speech recognition by using the alternative stroke-based input methods. The invention automatically retains and processes the valuable acoustic information from the speech recognition process. This is accomplished via internal data transfer and incorporation of an evaluation procedure involving several statistical models. The invention uses a confusion matrix to generate an acoustic model, and the acoustic model cooperates with character level and word level language models to optimize the error correction processing.
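The role of the confusion matrix can be illustrated with a minimal sketch. The text does not say how the matrix is built or normalized, so the following assumes a table of raw counts (how often each recognized syllable actually corresponded to each intended syllable) and converts each row into probabilities with add-one smoothing. The function name `confusion_to_probs`, the syllable labels, and the counts are all hypothetical.

```python
def confusion_to_probs(counts, smoothing=1.0):
    """Convert raw confusion counts into P(intended | recognized).

    counts[recognized][intended] is a hypothetical tally of how often the
    recognizer output `recognized` when the speaker meant `intended`.
    Additive smoothing keeps unseen confusions from getting probability 0.
    """
    # Collect every syllable that appears as a row or column label.
    symbols = sorted({s for row in counts.values() for s in row} | set(counts))
    probs = {}
    for recognized in symbols:
        row = counts.get(recognized, {})
        total = sum(row.get(s, 0) for s in symbols) + smoothing * len(symbols)
        probs[recognized] = {
            s: (row.get(s, 0) + smoothing) / total for s in symbols
        }
    return probs


# Made-up counts for three easily confused Mandarin syllables.
counts = {
    "shi": {"shi": 80, "si": 15, "xi": 5},
    "si": {"si": 70, "shi": 30},
}
probs = confusion_to_probs(counts)
```

Each row of `probs` sums to 1, so it can serve as a simple acoustic score for how plausible an intended syllable is given what the recognizer heard. A syllable with no observations, such as `"xi"` as a recognized output here, falls back to a uniform row.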
According to an aspect of the invention, a method for correcting one or more error characters in results of speech recognition comprises the steps of:
marking the one or more error characters in the speech recognition results;
inputting one or more correct characters corresponding to the one or more marked error characters by input based on character-shape;
recognizing the input based on character-shape;
displaying one or more candidate characters;
selecting one or more desired characters from the one or more candidate characters in accordance with the user; and
replacing the one or more error characters with the one or more selected characters;
the method characterized by further comprising the step of filtering the one or more candidate characters in accordance with acoustic information associated with the one or more error characters.
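The filtering step that distinguishes this method can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: it assumes the shape recognizer returns scored candidates, that each candidate has known pronunciations, and that acoustic information is available as a table of P(intended syllable | recognized syllable). The function `filter_candidates`, the threshold, and every score below are hypothetical.

```python
def filter_candidates(shape_candidates, error_syllable, pronunciations,
                      confusion, threshold=0.01):
    """Filter and rerank shape-recognition candidates by acoustic plausibility.

    shape_candidates: list of (character, shape_score) from the handwriting
        or stroke recognizer.
    error_syllable: syllable the speech recognizer output for the marked
        error character.
    pronunciations: character -> list of syllables it can be read as.
    confusion: confusion[recognized][intended] acoustic probabilities.
    """
    row = confusion.get(error_syllable, {})
    kept = []
    for char, shape_score in shape_candidates:
        # Best acoustic match over all readings of this candidate.
        acoustic = max(
            (row.get(syl, 0.0) for syl in pronunciations.get(char, [])),
            default=0.0,
        )
        if acoustic >= threshold:
            kept.append((char, shape_score * acoustic))
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical example: the recognizer heard "shi", and the user then
# wrote a character that the shape recognizer finds ambiguous.
shape_candidates = [("是", 0.40), ("足", 0.35), ("四", 0.25)]
pronunciations = {"是": ["shi"], "足": ["zu"], "四": ["si"]}
confusion = {"shi": {"shi": 0.80, "si": 0.15, "xi": 0.05}}

result = filter_candidates(shape_candidates, "shi", pronunciations, confusion)
```

In this example `足` is dropped because "zu" is not acoustically confusable with "shi" in the assumed table, and the remaining candidates are ordered `是` then `四`. Multiplying the two scores is one simple way to combine them; the text does not specify the combination rule.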
According to another aspect of the invention, a speech recognition system capable of correcting one or more error characters in results of speech recognition comprises:
voice detection means for collecting a speech sample of a user;
pronunciation probability calculation means, which, for each pronunciation in an acoustic model, gives a probability estimation value of whether the pronunciation is the same as the speech sample;
word probability calculation means, which, according to a language model, gives a probability estimation value of a word occurring in a current context;
word matching means for calculating a joint probability through combining a probability value calculated by the pronunciation probability calculation means with a probability value calculated by the word probability calculation means and taking the word with the greatest joint probability value as the result of the speech recognition;
context generating means for modifying the current context by using the speech recognition result; and,
word output means;
the speech recognition system characterized by further comprising error correction means, a user marking the one or more error characters in the results of the speech recognition via the error correction means and inputting one or more correct characters corresponding to the one or more error characters by input based on character-shape, and the error correction means recognizing the input, generating one or more candidate characters and filtering the one or more candidate characters via acoustic information associated with the one or more error characters.
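The word matching means described above combines two probability estimates and picks the word with the greatest joint value. A minimal sketch of that selection follows, assuming the two estimates are supplied as dictionaries keyed by candidate word and that the joint probability is their product, computed in the log domain to avoid underflow. The function `best_word` and the probability values are hypothetical.

```python
import math


def best_word(acoustic_probs, language_probs):
    """Return (word, log joint probability) for the most likely candidate.

    acoustic_probs: word -> estimated probability that its pronunciation
        matches the speech sample.
    language_probs: word -> estimated probability of the word occurring in
        the current context.
    """
    best, best_score = None, -math.inf
    for word, p_acoustic in acoustic_probs.items():
        p_language = language_probs.get(word, 0.0)
        if p_acoustic <= 0.0 or p_language <= 0.0:
            continue  # A zero in either model rules the word out.
        score = math.log(p_acoustic) + math.log(p_language)
        if score > best_score:
            best, best_score = word, score
    return best, best_score


# Made-up estimates: the acoustically stronger candidate loses because
# the language model finds it far less likely in context.
acoustic = {"recognize speech": 0.4, "wreck a nice beach": 0.6}
language = {"recognize speech": 0.05, "wreck a nice beach": 0.001}

word, score = best_word(acoustic, language)
```

Here `word` is `"recognize speech"`, since 0.4 × 0.05 exceeds 0.6 × 0.001. The context generating means would then update the context with this result before the next word is scored.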
REFERENCES:
patent: 5287275 (1994-02-01), Kimura
patent: 5768422 (1998-06-01), Yaeger
patent: 5883986 (1999-03-01), Kopec et al.
patent: 6340967 (2002-01-01), Maxted
patent: 6393395 (2002-05-01), Guha et al.
Chiang et al, “On Jointly Learning the Parameters in a Character Synchronous Integrated Speech and Language Model”, IEEE Transactions on Speech and Audio Processing, May 3, 1996, pp. 167-189.*
Yang et al, “Statistics-Based Segment Pattern Lexicon—A New Direction for Chinese Language Modeling”, IEEE ICASSP, vol. 1, pp. 169-172.*
Wang et al, “Complete Recognition of Continuous Mandarin Speech for Chinese Language with Very Large Vocabulary Using Limited Training Data”, IEEE Transactions on Speech and Audio Processing, Mar. 1997, vol. 5 #3, pp. 195-200+.
Qin Yong
Shen Li Qin
Su Hui
Tang Donald T.
Wang Qian Ying
Chawan Vijay B.
Opsasnick Michael N.
Otterstedt Paul J.