Reexamination Certificate
1999-09-15
2002-08-20
Dorvil, Richemond (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S252000, C704S253000
active
06438521
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a speech recognition method and apparatus for recognizing input speech, and a computer-readable memory.
2. Description of the Related Art
In a speech recognition technique, standard patterns (words or sentences) of standard input information are registered in some form in advance, and speech recognition is performed by comparing the registered standard patterns with an input utterance. Registration forms include, for example, forms using a phonemic expression and generative grammar. In speech recognition, scores representing the similarity between input speech and the standard patterns are determined, and a standard pattern exhibiting the highest score is determined as a speech recognition result.
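The selection rule described above (score every registered standard pattern against the input and take the highest) can be sketched as follows. This is only an illustration: the function names `similarity` and `recognize` are hypothetical, and the string-similarity measure from Python's `difflib` is a stand-in for a real acoustic score, which the text does not specify.

```python
from difflib import SequenceMatcher


def similarity(observed: str, pattern: str) -> float:
    """Toy stand-in for an acoustic score: string similarity in [0, 1].

    Assumption: a real recognizer would compare acoustic features,
    not spelled-out phoneme strings.
    """
    return SequenceMatcher(None, observed, pattern).ratio()


def recognize(observed: str, standard_patterns: list[str]) -> tuple[str, float]:
    """Return the registered pattern with the highest score, and that score."""
    scores = {p: similarity(observed, p) for p in standard_patterns}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # A slightly misheard "kanagawa" is still closest to the registered pattern.
    result, score = recognize("kanagava", ["kanagawa", "kagawa", "nagano"])
    print(result, round(score, 3))
```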
As a method of inputting speech to be subjected to speech recognition, a method of inputting speech by separating an utterance into syllables of the speech is available. When, for example, “kanagawa” is to be input, the user separately utters the respective syllables, like “ka, na, ga, wa”. This input method is called single syllable articulation.
In speech recognition for speech input by single syllable articulation, the following two methods have been used.
1. Speech recognition is performed for speech information obtained by removing periods regarded as silence periods from input speech.
2. Patterns input by single syllable articulation are also registered as patterns to be subjected to speech recognition, and speech recognition is performed, including speech recognition for each pattern.
According to method 1, periods regarded as silence periods are removed from input speech, and speech recognition is performed for the speech information obtained by connecting the remaining periods of speech (FIG. 7).
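A minimal sketch of method 1, assuming that silence is detected by a simple per-frame energy threshold. The text does not say how silence periods are determined, so the threshold test, the frame representation, and the names `frame_energy` and `remove_silence` are all assumptions made for illustration.

```python
def frame_energy(frame: list[float]) -> float:
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def remove_silence(frames: list[list[float]], threshold: float = 0.01) -> list[list[float]]:
    """Drop frames regarded as silence and connect the remaining frames in order.

    Assumption: a frame counts as silence when its energy is below `threshold`.
    """
    return [f for f in frames if frame_energy(f) >= threshold]


if __name__ == "__main__":
    frames = [
        [0.5, -0.4],   # voiced ("ka")
        [0.0, 0.01],   # pause between syllables
        [0.3, 0.2],    # voiced ("na")
    ]
    # The pause frame is removed; the two voiced frames are joined.
    print(remove_silence(frames))
```

With a fixed threshold like this one, a quiet but voiced frame is discarded along with the pauses, which is the kind of voiced/silence misjudgment that degrades recognition under this method.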
According to method 2, when the input speech is “kanagawa”, not only the pattern “kanagawa” but also the pattern “ka (silence period) na (silence period) ga (silence period) wa” are registered as standard patterns. When the highest score is obtained between the input speech and the standard pattern registered as “ka (silence period) na (silence period) ga (silence period) wa”, “kanagawa” is used as the speech recognition result.
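Method 2 can be sketched as a registry that stores two standard patterns per word, both mapping to the same recognition result. The token-tuple representation, the `SIL` marker, and the name `build_registry` are hypothetical choices for this illustration only.

```python
SIL = "(silence period)"  # hypothetical marker for a pause between syllables


def build_registry(words: dict[str, list[str]]) -> dict[tuple[str, ...], str]:
    """Register a continuous and a single-syllable-articulation pattern per word.

    Both patterns map to the same word, so whichever one scores highest
    yields the same recognition result.
    """
    registry: dict[tuple[str, ...], str] = {}
    for word, syllables in words.items():
        # Pattern for continuous speech: "ka na ga wa" with no pauses.
        registry[tuple(syllables)] = word
        # Pattern for single syllable articulation: a pause between syllables.
        separated: list[str] = []
        for i, syllable in enumerate(syllables):
            if i > 0:
                separated.append(SIL)
            separated.append(syllable)
        registry[tuple(separated)] = word
    return registry


if __name__ == "__main__":
    registry = build_registry({"kanagawa": ["ka", "na", "ga", "wa"]})
    for pattern, word in registry.items():
        print(" ".join(pattern), "->", word)
```

Every multi-syllable word contributes two entries, so the number of patterns to be scored roughly doubles.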
The following problems are posed in the above speech recognition.
First, in method 1, erroneous determination of voiced/silence periods adversely affects the recognition result. To accurately determine whether given speech is silence, processing similar to speech recognition is required. In this case, problems similar to those posed in method 2 arise.
In method 2, two types of standard patterns, i.e., a pattern input by single syllable articulation and a pattern input by the other method, must be registered for each input speech. This leads to a large processing amount. In general, the recognition rate is often low in the environment at the beginning of a word (immediately after a silence period). In single syllable articulation, every syllable is in the environment at the beginning of a word, so the reliability of the recognition result is low.
There is another problem. In many cases, speech recognition is executed together with speech segmentation processing, which automatically detects the start and end points of an utterance. In single syllable articulation, the presence of a silence period between syllables tends to cause a speech segmentation error, i.e., a silence period inserted in a word is erroneously recognized as the end of the utterance. When such a speech segmentation error occurs, the probability of obtaining an accurate speech recognition result from the speech segment is low.
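The segmentation error can be illustrated with a simple endpoint detector that declares the end of an utterance after a fixed run of silent frames. The detector, its `max_silence_frames` parameter, and the name `segment_utterance` are assumptions for this sketch; the text does not describe how the segmentation processing actually works.

```python
def segment_utterance(voiced_flags, max_silence_frames=3):
    """Return (start, end) frame indices of the first utterance, end exclusive.

    Assumption: the utterance is considered finished once
    `max_silence_frames` consecutive silent frames follow voiced speech.
    Returns None if no voiced frame is found.
    """
    start = None
    last_voiced = None
    silence_run = 0
    for i, voiced in enumerate(voiced_flags):
        if voiced:
            if start is None:
                start = i
            last_voiced = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= max_silence_frames:
                break  # the pause is taken to be the end of the utterance
    if start is None:
        return None
    return start, last_voiced + 1


if __name__ == "__main__":
    continuous = [0, 1, 1, 1, 1, 0, 0, 0, 0]
    # Two syllables separated by a four-frame pause.
    separated = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
    print(segment_utterance(continuous))  # (1, 5): the whole utterance
    print(segment_utterance(separated))   # (0, 2): cut off after the first syllable
```

In the second call the pause inside the word is long enough to trigger the end-of-utterance rule, so only the first syllable is passed on to recognition.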
SUMMARY OF THE INVENTION
The present invention has been made in consideration of the above problems, and has as its object to provide a speech recognition method and apparatus which can recognize speech with high efficiency and accuracy, and a computer-readable memory.
In order to achieve the above object, a speech recognition apparatus according to the present invention determines whether speech separately uttered as a single syllable is included in the input speech. The determination is made by comparing a first score, calculated by comparing an arbitrary syllable sequence with the input speech, and a second score, calculated from a speech recognition result of the input speech. On the basis of the determination result, the apparatus prompts the user to input the speech again by speaking continuously without any pause.
A speech recognition method is also presented according to the present invention. In this method, it is determined whether speech separately uttered as a single syllable is included in the input speech by comparing a first score, calculated by comparing an arbitrary syllable sequence with the input speech, and a second score, calculated from a speech recognition result of the input speech. On the basis of the determination result, the user is prompted to input the speech again by speaking continuously without any pause.
In an alternative method of recognizing input speech, a speech input is received from a speaker and a first score of a single syllable sequence using a speech recognition algorithm and a second score of a speech recognition result for the input speech are calculated. The first and second scores are compared and the speech recognition result is output when the second score is larger than the first score.
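The score comparison described in this summary can be sketched as a single decision function. How the two scores are computed is not shown here, and the combination of "output when the second score is larger" with "otherwise prompt the user" is one reading of the summary, not a statement of the claimed method. The name `decide` and the prompt wording are hypothetical.

```python
REPROMPT_MESSAGE = "Please say the whole phrase again without pausing."  # hypothetical wording


def decide(first_score: float, second_score: float, recognition_result: str) -> tuple[str, str]:
    """Choose between outputting the result and prompting for re-input.

    first_score:  score of an arbitrary (single) syllable sequence against the input.
    second_score: score of the speech recognition result for the same input.
    """
    if second_score > first_score:
        # The registered pattern explains the input better: accept the result.
        return ("output", recognition_result)
    # Otherwise the input is treated as containing separately uttered syllables.
    return ("reprompt", REPROMPT_MESSAGE)


if __name__ == "__main__":
    print(decide(first_score=0.62, second_score=0.81, recognition_result="kanagawa"))
    print(decide(first_score=0.79, second_score=0.55, recognition_result="kanagawa"))
```

Because the check uses only two scores, it avoids both the silence removal of method 1 and the doubled pattern registration of method 2.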
In order to achieve the above object, a computer-readable memory according to the present invention includes program code for recognizing input speech. The program code configures a system to determine whether speech separately uttered as a single syllable is included in the input speech by comparing a first score, calculated by comparing an arbitrary syllable sequence with the input speech, and a second score, calculated from a speech recognition result of the input speech, and to prompt the user to input the speech again by speaking continuously without any pause on the basis of the determination result.
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.
Komori Yasuhiro
Nakagawa Ken-ichiro
Yamada Masayuki
Dorvil Richemond
Morgan & Finnegan L.L.P.
Nolan Daniel