Reexamination Certificate
1998-10-13
2001-06-19
Hudspeth, David R. (Department: 2741)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S257000, C704S254000
active
06249763
ABSTRACT:
FIELD OF THE INVENTION
This invention relates to a speech recognition apparatus and a method thereof for recognizing words of a specific foreign language contained in a speech spoken by a speaker who has a specific native language, for example, a speech recognition apparatus and a method thereof for recognizing an English speech spoken by a Japanese speaker to output data (text data) indicating a string of English words contained in the speech.
This invention also relates to a pronunciation correcting apparatus and method for teaching a correct pronunciation to a speaker to correct the pronunciation utilizing data (candidate word data) obtained in said speech recognition apparatus and in the course of practicing said method.
BACKGROUND OF THE INVENTION
A speech recognition apparatus has been so far used for recognizing words contained in a speech spoken by an unspecified speaker to output the words as text data.
PUPA 06-12483, PUPA 08-50493 and PUPA 09-22297 (references 1-3), for example, disclose such speech recognition methods.
For example, when a conventional English speech recognition apparatus is used to generate English text data from English speech spoken by a Japanese speaker, the recognition rate is low. This is because the English language contains sounds that do not exist in Japanese (th, etc.) and sounds that are difficult to distinguish in Japanese (l, r, etc.). Japanese speakers generally cannot pronounce such English sounds correctly, so the English speech recognition apparatus translates an incorrect pronunciation into a word as it is. For example, even when a Japanese speaker intends to pronounce “rice” in English, the English speech recognition apparatus may recognize the pronunciation as “lice” or “louse”.
Such problems may occur in various situations: conversely to the above, when an American whose native language is English uses a speech recognition apparatus to generate Japanese text from speech in Japanese; when a British speaker whose native language is British English uses a speech recognition apparatus tuned for American English; or when a particular person has difficulty pronouncing words correctly for some reason.
The speech recognition apparatuses disclosed in the above references, however, are unable to solve such problems.
If the speaker's English pronunciation improves and approaches that of a native speaker, the recognition rate of the speech recognition apparatus naturally improves, and it is in fact desirable for the speaker to improve his or her English conversation.
For example, PEPA4-54956 discloses a learning apparatus that recognizes English speech of a speaker and causes the speaker to affirm the recognized English speech (reference 4).
Also, PUPA60-123884, for example, discloses an English learning machine that lets the speaker listen to the speech to be learned by using a speech synthesizer LSI (reference 5).
Learning apparatuses for learning the pronunciation of a foreign language are disclosed in many other publications, including PEPA44-7162, PEPA H7-117807, PUPA61-18068, PEP H8-27588, PUPA62-111278, PUPA62-299985, PUPA3-75869, PEPA6-27971, PEPA8-12535, and PUPA3-226785 (references 6 to 14).
However, the speaker cannot necessarily attain a sufficient learning effect using the learning apparatuses disclosed in these references, because the speaker has to compare his or her own pronunciation with a presented pronunciation, or fails to find which part of his or her pronunciation is wrong.
SUMMARY OF THE INVENTION
This invention is conceived in view of the above-described problems of the conventional technology and aims at providing a speech recognition apparatus and a method thereof for recognizing words contained in speech of a predetermined language spoken by a speaker whose native language is other than the predetermined language (a non-native speaker) and translating the words into the words of the predetermined language intended by the speaker to generate correct text data.
It is also an object of this invention to provide a speech recognition apparatus and a method thereof for translating speech spoken by a speaker in any region into the words intended by the speaker, so that correct text data can be generated even when the pronunciation of the same language varies among the regions where the language is spoken.
It is also an object of this invention to provide a speech recognition apparatus and a method thereof which compensate for differences in pronunciation among individuals to maintain a consistently high recognition rate.
It is another object of this invention to provide a pronunciation correcting apparatus and method for pointing out a problem in a speaker's pronunciation and letting the speaker learn a native speaker's pronunciation to correct the speaker's pronunciation, by utilizing data obtained from said speech recognition apparatus and in the course of practicing said method.
It is still another object of this invention to provide a pronunciation correcting apparatus and method capable of automatically comparing a speaker's pronunciation with a correct pronunciation to point out an error and of presenting detailed information indicating how the speaker should correct the pronunciation.
In order to achieve the above objectives, this invention provides a first speech recognition apparatus for recognizing words from speech data representing one or more words contained in a speech, comprising: candidate word correlating means for correlating each of one or more of said speech data items of words to one or more sets of candidates (candidate words) comprising a combination of one or more of said words obtained by recognizing each of one or more of said speech data items; analogous word correlating means for correlating each of said candidate words correlated to each of one or more of the speech data items of said words to zero or more sets of a combination of one or more of said words (analogous words) which may correspond to the pronunciation of each of said candidate words; and speech data recognition means for selecting either said candidate word correlated to each of one or more of said speech data items of words or said analogous word correlated to each of said candidate words as a recognition result of each of said speech data items of words.
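The three means described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the table contents, the function names (`correlate_candidates`, `correlate_analogous`, `recognize`, `prefer_words`) and the selection rule are all hypothetical, and the text above does not specify how the final selection is made. The sketch assumes the recognizer supplies at least one candidate word per speech data item, and that a candidate may have no analogous words at all.

```python
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical analogous-word table (illustrative entries only): each word the
# recognizer may output maps to the words the speaker may actually have meant.
ANALOGOUS_WORDS: Dict[str, List[str]] = {
    "lice": ["rice"],
    "louse": ["rice"],
}

Chooser = Callable[[List[str]], str]


def correlate_candidates(recognizer_output: Sequence[str]) -> List[str]:
    """Candidate word correlation: keep hypotheses in order, dropping duplicates."""
    return list(dict.fromkeys(recognizer_output))


def correlate_analogous(
    candidates: Sequence[str], table: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Analogous word correlation: each candidate gets a possibly empty list."""
    return {word: list(table.get(word, [])) for word in candidates}


def recognize(
    recognizer_output: Sequence[str],
    table: Dict[str, List[str]],
    choose: Chooser,
) -> str:
    """Select one word among the candidates and their analogous words."""
    candidates = correlate_candidates(recognizer_output)
    analogous = correlate_analogous(candidates, table)
    options: List[str] = []
    for word in candidates:
        # Each candidate is followed by its own analogous words.
        for option in [word, *analogous[word]]:
            if option not in options:
                options.append(option)
    return choose(options)


def prefer_words(expected: Set[str]) -> Chooser:
    """Assumed selection rule: first option found in `expected`, else the top candidate."""
    def choose(options: List[str]) -> str:
        for option in options:
            if option in expected:
                return option
        return options[0]
    return choose


if __name__ == "__main__":
    # The recognizer heard "lice" or "louse"; the context suggests "rice".
    print(recognize(["lice", "louse"], ANALOGOUS_WORDS, prefer_words({"rice"})))
```

Passing the selection step in as a function keeps the sketch neutral about how the choice is made; a user prompt or a context-based score could be substituted for the hypothetical `prefer_words` rule.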
Preferably, said speech data represents one or more words contained in a speech of a predetermined language, said candidate word correlating means correlates each of one or more speech data items of said words to one or more sets of candidate words of said predetermined language obtained by recognizing each of the one or more speech data items, said analogous word correlating means correlates each of said candidate words correlated to each of the one or more speech data items of said words to zero or more sets of analogous words of said predetermined language which may correspond to the pronunciation of each of said candidate words, and said speech data recognition means selects either said candidate word correlated to each of one or more of said speech data items of words or said analogous word correlated to each of said candidate words as a recognition result of each of one or more of the speech data items of said words.
Preferably, the speech of said predetermined language is pronounced by a speaker who mainly speaks a language other than said predetermined language, and the speech recognition apparatus is provided with analogous word storage means for storing zero or more sets of words of said predetermined language which may correspond to each of one or more speech data items of the words contained in the speech of said predetermined language in correlation to each of one or more words of said predetermined language as said analogous word of each of one or more words of said predetermined language when each of one or more words of said predetermined language is pronounced by said speaker, an
Hudspeth David R.
International Business Machines Corporation
Otterstedt Paul J.
Wieland Susan