Method for automatically generating pronunciation dictionary...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S251000, C704S232000

active

06236965

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method for automatically generating a pronunciation dictionary in a speech recognition system, in which pronunciation sequences are accurately generated for words not registered in a lexicon (dictionary) by utilizing an exception word pronunciation dictionary, an exception grapheme pronunciation dictionary and grapheme-wise multi-layer perceptrons, and in which the size of the memory and the amount of calculation are reduced through stepwise processing.
2. Description of the Prior Art
FIG. 1 illustrates the constitution of the general speech recognition system.
The constitution and operation of the general speech recognition system are well known in this field, and therefore their description is omitted here. However, the procedure of outputting the speech recognition result in the form of text is described in detail below.
As shown in FIG. 1, if speech is inputted into the general speech recognition system, an endpoint detecting and feature extracting section 11 detects the period in which speech exists, so as to extract the feature vector of that period.
Meanwhile, if information on the vocabulary recognizable by the speech recognition system is inputted, a recognition candidate word list converting section 12 alters the recognition candidate word list. Then a pronunciation dictionary generating section 13 forms pronunciation sequences of the respective words by referring to the dictionary or based on the pronunciation rules. Then a word model generating section 15 combines the pronunciation sequences generated by the pronunciation dictionary generating section 13 with the respective phoneme model data bases 14, thereby forming a word model for each candidate word.
Finally, a pattern comparing section 16 compares the word models of the word model generating section 15 with the input speech feature vectors extracted by the endpoint detecting and feature extracting section 11, so as to output the closest candidate word as the recognition result.
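The interplay of sections 13 to 16 can be pictured with a minimal sketch. The text does not state how the pattern comparing section measures closeness or what a phoneme model contains, so everything here is an assumption made for illustration: each phoneme model is taken to be a short list of template feature frames, a word model is their concatenation, and dynamic time warping is swapped in as the comparison measure. The names `dtw_distance`, `build_word_model` and `recognize` are hypothetical.

```python
from math import dist, inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def build_word_model(pronunciation, phoneme_models):
    """Concatenate per-phoneme template frames into one word template (assumed form of section 15)."""
    frames = []
    for phoneme in pronunciation:
        frames.extend(phoneme_models[phoneme])
    return frames


def recognize(features, pronunciations, phoneme_models):
    """Return the candidate word whose word model is closest to the input features (assumed form of section 16)."""
    word_models = {
        word: build_word_model(seq, phoneme_models)
        for word, seq in pronunciations.items()
    }
    return min(word_models, key=lambda w: dtw_distance(features, word_models[w]))
```

In this sketch the pronunciation dictionary is simply the `pronunciations` mapping from each candidate word to its phoneme sequence, which is the output that the rest of this document is concerned with generating.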
FIG. 2 is a flow chart showing the conventional procedure of forming the English pronunciation dictionary. That is, FIG. 2 illustrates the procedure of forming the pronunciation dictionary by the pronunciation dictionary generating section 13 of FIG. 1.
As shown in FIG. 2, the conventional procedure of generating the English word pronunciations is carried out in the following manner. First, a text of recognition candidate words is inputted (201). Then numerals are converted to letters and punctuation is removed, thereby pre-processing the text. Thus the text is converted into processable alphabet letters (202).
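The pre-processing of steps 201 and 202 might look like the following minimal sketch. The text does not say how numerals are verbalized or whether case is normalized, so spelling each digit separately and upper-casing the result are assumptions, and `preprocess` and `DIGIT_WORDS` are hypothetical names.

```python
import re

# Assumption: each digit is read out individually ("911" -> "nine one one").
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def preprocess(text):
    """Convert numerals to letters, strip punctuation, and return upper-case word tokens."""
    # Spell out each ASCII digit, padded with spaces so digits split into separate words.
    spelled = re.sub(r"[0-9]", lambda m: f" {DIGIT_WORDS[m.group()]} ", text)
    # Replace anything that is not an ASCII letter or whitespace with a space.
    letters_only = re.sub(r"[^A-Za-z\s]", " ", spelled)
    return letters_only.upper().split()
```

For example, `preprocess("Call 911, now!")` yields the tokens `CALL`, `NINE`, `ONE`, `ONE`, `NOW`, each of which can then be looked up in the registered pronunciation dictionary.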
Then a check is carried out as to whether the English word thus obtained is present in a registered pronunciation dictionary (203).
If it is found that the English word is present in the registered pronunciation dictionary, then the English word pronunciation sequences are outputted (207).
On the other hand, if the English word is not found in the registered pronunciation dictionary, then pronunciation sequences are generated based on either one of the following two methods.
First, the English pronunciation rules are applied (204) to output English word pronunciation sequences (207).
Second, a neural network is utilized to generate articulatory features in accordance with the upstream and downstream connections of the respective graphemes (205). Then the articulatory features are mapped to the relevant phonemes (206), thereby outputting English word pronunciation sequences (207).
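The conventional flow of FIG. 2 (steps 203 to 207) can be sketched as a dictionary lookup with a fallback generator. This is an illustration under stated assumptions rather than the procedure of any particular system: `TOY_RULES` is a hypothetical, context-free letter-to-phoneme table standing in for the rule-based path (204), the phoneme symbols are ARPAbet-like placeholders, and the neural path (205 to 206) would be passed in as a different `fallback` callable.

```python
# Hypothetical one-letter table; real English pronunciation rules depend on context.
TOY_RULES = {
    "A": "AE", "B": "B", "C": "K", "D": "D", "E": "EH", "F": "F", "G": "G",
    "H": "HH", "I": "IH", "J": "JH", "K": "K", "L": "L", "M": "M", "N": "N",
    "O": "AA", "P": "P", "Q": "K", "R": "R", "S": "S", "T": "T", "U": "AH",
    "V": "V", "W": "W", "X": "K S", "Y": "Y", "Z": "Z",
}


def apply_toy_rules(word):
    """Stand-in for step 204: map each letter of an upper-case A-Z word to phonemes."""
    phonemes = []
    for letter in word:
        phonemes.extend(TOY_RULES[letter].split())
    return phonemes


def generate_pronunciation(word, registered, fallback=apply_toy_rules):
    """Steps 203-207: use the registered dictionary if possible, else the fallback generator."""
    key = word.upper()
    if key in registered:       # step 203: word found in the registered dictionary
        return registered[key]  # step 207: output the stored sequence
    return fallback(key)        # step 204 (or 205-206), then step 207
```

With `registered = {"CAT": ["K", "AE", "T"]}`, the word "cat" is answered from the dictionary, while "dog" falls through to the toy rules. A word such as "one" shows why a context-free table of this kind cannot be accurate for English.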
As described above, speech recognition is a technique in which the pronunciation of the user is analyzed and its meaning is determined.
In the conventional speech recognition system, the target vocabulary to be recognized is determined in advance, and if the user pronounces one or more of the target words, then the closest words are detected and outputted.
However, such a recognition system is focused on detecting the proper words from the predetermined registered vocabulary. Therefore, if a non-registered new word is to be recognized, the performance deteriorates.
When a new non-registered word is to be recognized by such a recognition apparatus, two technical problems have to be solved for accurate recognition.
One is a technique of properly modeling the basic patterns for a new word, that is, modeling in units of phonemes or allophones. The other is a technique of automatically generating a pronunciation dictionary by connecting a new word to the defined phonemes or allophones.
The technique of automatically generating a pronunciation dictionary is handled differently depending on the words to be handled. For example, in the case of Korean words, a proper pronunciation dictionary for each word can mostly be formed based on 10 or more pronunciation rules and several exception rules. If the words which cannot be expressed by the basic rules are provided with exception pronunciation dictionaries, then accurate pronunciation dictionaries can be generated for almost all vocabularies and proper nouns.
In the case of English, however, it is impossible to form accurate pronunciation dictionaries for arbitrary words based on the aforementioned approach of rules and exception dictionaries.
Therefore, a large scale pronunciation dictionary containing 100 thousand words or more is conventionally formed. Then for proper nouns and coined words, the dictionary is revised or pronunciation dictionaries are formed based on simple pronunciation rules.
The conventional methods of forming the English pronunciation dictionaries are classified into two kinds. One is a method of programming several pronunciation rules. The other is a method of resorting to speech synthesis. That is, the articulatory features for the respective phonemes are defined based on phonetic knowledge. Then, based on this, the articulatory features for the grapheme input are found by utilizing a neural network, and these are mapped to the relevant phonemes.
However, in the former method, due to the diversified pronunciation features of the English language, it is difficult to form accurate pronunciation dictionaries for arbitrary words. The latter method relies on inaccurate experimental phonetic knowledge and on applying that knowledge to the phonemes. Therefore, it is impossible to form an accurate pronunciation dictionary.
SUMMARY OF THE INVENTION
The present invention is intended to overcome the above described disadvantages of the conventional techniques.
Therefore it is an object of the present invention to provide a method for automatically generating a pronunciation dictionary in a speech recognition system, and a recording medium readable by a program of a computer for achieving the same, in which pronunciation patterns of a large scale pronunciation dictionary are learned through a neural network without resorting to phonetic knowledge, and the pronunciation sequences for input words are accurately formed by utilizing an exception grapheme pronunciation dictionary or an exception word pronunciation dictionary for those graphemes or words for which the learned neural network cannot form an accurate pronunciation, thereby reducing the size of the memory and the amount of calculations.
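One way to picture the stepwise processing named in this object is the sketch below. The order of the checks, the context window used as the key of the exception grapheme dictionary, the use of `None` for a silent grapheme, and all identifiers (`pronounce`, `context_window`, `PAD`) are assumptions made for illustration, not details taken from this text. The per-grapheme multi-layer perceptrons are represented by plain callables so that the example stays self-contained.

```python
PAD = "_"  # hypothetical padding symbol marking word boundaries


def context_window(word, index, width=2):
    """Return the grapheme at `index` together with `width` neighbours on each side."""
    padded = PAD * width + word + PAD * width
    return padded[index : index + 2 * width + 1]


def pronounce(word, exception_words, exception_graphemes, grapheme_mlps):
    """Assumed stepwise lookup: exception word, then exception grapheme, then per-grapheme model."""
    key = word.upper()
    # Step 1: whole-word exceptions take priority.
    if key in exception_words:
        return exception_words[key]
    phonemes = []
    for i, grapheme in enumerate(key):
        window = context_window(key, i)
        if window in exception_graphemes:
            # Step 2: a grapheme in a context that the learned model gets wrong.
            phoneme = exception_graphemes[window]
        else:
            # Step 3: the learned model for this grapheme (a stub callable here).
            phoneme = grapheme_mlps[grapheme](window)
        if phoneme is not None:  # assumption: None marks a silent grapheme
            phonemes.append(phoneme)
    return phonemes
```

Under these assumptions, only the words and grapheme contexts that the learned models mispredict need to be stored explicitly, which is how a scheme of this shape could reduce memory compared with keeping the full large scale dictionary.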
In achieving the above object, the method for automatically generating pronunciation dictionaries in a speech recognition system according to the present invention includes the steps of: learning a multi-layer perceptron for directly mapping phonemes relevant to respective graphemes by utilizing a neural network, so as to form an exception word pronunciation dictionary data base, an exception grapheme pronunciation dictionary data base, and a phoneme output multi-layer perceptron parameter data base for respective graphemes; and inspecting the exception word pronunciation dictionary data base, the exception grapheme pronunciation dictionary data base, and the phoneme output
