Speech recognition method

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S205000

Reexamination Certificate

active

06321195

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a speech recognition method for performing an automatic dialing function using speech recognition.
2. Description of Related Art
Human beings communicate their thoughts to one another through speech.
Speech, the means of communication between human beings, is now also used as a means of communication between human beings and machines.
In other words, speech recognition techniques are applied to the operation of everyday electric and electronic equipment.
In particular, applying speech recognition to a mobile telephone offers various practical advantages.
SUMMARY OF THE INVENTION
Accordingly, the present invention is directed to a speech recognition method that substantially obviates one or more of the limitations and disadvantages of the related art.
An objective of the present invention is to provide a speech recognition method that allows dialing by speech by applying an existing speech recognition algorithm to a mobile telephone having a built-in vocoder.
Additional features and advantages of the invention will be set forth in the following description, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure as illustrated in the written description and claims hereof, as well as the appended drawings.
To achieve these and other advantages, and in accordance with the purpose of the present invention as embodied and broadly described, in a telephone modulating an input speech and having a built-in vocoder for encoding a modulated speech signal, a speech recognition method comprises:
a training step of, if a user enters a telephone number and a speech corresponding to the telephone number, performing the encoding at the vocoder, detecting only a speech section using information output as a result of the encoding, and extracting and storing a feature of the detected speech section;
a recognition step of, if an input speech is received, performing encoding at the vocoder, detecting only a speech section using information output as a result of the encoding, extracting a feature of the detected speech section, comparing the extracted feature with features of registered words stored during the training step, and selecting a registered word having a feature most similar to that of the input speech; and
a step of determining a result of the recognition to be right if a similarity of the registered word selected at the recognition step does not exceed a predetermined threshold and automatically dialing a telephone number corresponding to the recognized word.
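The three steps above (training, recognition, and the accept-and-dial decision) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `VoiceDialer`, the pluggable `distance_fn`, the toy `mean_abs_distance`, and the threshold value are all hypothetical. The sketch treats the "similarity" score as a distance (smaller means more alike), which is an assumption consistent with accepting a result only when the score does not exceed the threshold.

```python
class VoiceDialer:
    """Speaker-dependent isolated-word dialer: train, recognize, accept or reject."""

    def __init__(self, distance_fn, reject_threshold):
        self.distance_fn = distance_fn            # hypothetical: any template distance
        self.reject_threshold = reject_threshold  # hypothetical tuning value
        self.templates = {}                       # telephone number -> stored features

    def train(self, number, features):
        """Training step: store the features of the spoken word for this number."""
        self.templates[number] = features

    def recognize(self, features):
        """Recognition step: return the number to dial, or None if rejected."""
        if not self.templates:
            return None
        number, score = min(
            ((num, self.distance_fn(features, tmpl))
             for num, tmpl in self.templates.items()),
            key=lambda pair: pair[1],
        )
        # Accept only if the best score does not exceed the threshold.
        return number if score <= self.reject_threshold else None


def mean_abs_distance(a, b):
    """Toy distance for the demo only: compares equal-length scalar sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / max(len(a), 1)


dialer = VoiceDialer(mean_abs_distance, reject_threshold=0.5)
dialer.train("555-0100", [1.0, 2.0, 3.0])
dialer.train("555-0199", [9.0, 9.0, 9.0])
accepted = dialer.recognize([1.1, 2.0, 2.9])   # close to the first template
rejected = dialer.recognize([5.0, 5.0, 5.0])   # far from every template
```

In a real handset the feature sequences would come from the vocoder output rather than hand-written lists, and the distance function would be a time-alignment method such as DTW.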
The training step and the recognition step are characterized by detecting only the actually voiced speech section from the input signal, using codebook gain as energy information, the codebook gain being output as the result of the encoding by the vocoder.
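As a rough illustration of detecting the speech section from per-frame codebook gain, the sketch below marks a frame as active when its gain exceeds a threshold and takes the span from the first to the last run of active frames. The function name, the fixed threshold, and the `min_run` parameter are assumptions made for the example; the source does not specify the detection rule.

```python
def detect_speech_section(gains, threshold, min_run=3):
    """Return (start, end) frame indices of the speech section, end exclusive.

    gains     : per-frame codebook gain values taken from the vocoder output
    threshold : hypothetical energy threshold separating speech from silence
    min_run   : hypothetical number of consecutive active frames required,
                so that isolated noisy frames are ignored
    Returns None when no speech section is found.
    """
    active = [g > threshold for g in gains]

    # First position where min_run consecutive frames are active.
    start = None
    for i in range(len(active) - min_run + 1):
        if all(active[i:i + min_run]):
            start = i
            break
    if start is None:
        return None

    # Last position where min_run consecutive frames are active.
    end = None
    for i in range(len(active), min_run - 1, -1):
        if all(active[i - min_run:i]):
            end = i
            break
    return start, end


gains = [0.1, 0.2, 5.0, 6.0, 7.0, 0.1, 8.0, 9.0, 9.0, 0.2, 0.1]
section = detect_speech_section(gains, threshold=1.0, min_run=2)
```

A short dip inside the word (frame 5 in the example) stays inside the detected section because only the outer boundaries are searched.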
The training step and the recognition step are characterized by extracting spectrum coefficients of the frames corresponding to the speech section as features, the coefficients being output as the result of the encoding if the speech section is detected.
The recognition step is characterized by comparing the extracted features with the features of the registered words stored during the training step, once the features of the frames corresponding to the speech section have been extracted, to select the registered word having the feature most similar to that of the input speech.
The recognition step is characterized by extracting line spectrum pair (LSP) parameters that have been encoded at the vocoder and transforming the extracted LSP parameters into pseudo-cepstrums.
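The source does not give the transformation itself. One definition of the pseudo-cepstrum that appears in the speech-coding literature, offered here as an assumption about what is meant, approximates the n-th LPC cepstral coefficient directly from the LSP frequencies ω_i (in radians, between 0 and π) as c_n ≈ (1 + (-1)^n) / (2n) + (1/n) Σ cos(n ω_i). A minimal sketch of that formula, with a hypothetical function name:

```python
import math


def lsp_to_pseudo_cepstrum(lsp, n_coeffs):
    """Approximate cepstral coefficients c_1..c_n from LSP frequencies.

    lsp      : LSP frequencies in radians (0 < w < pi), one list per frame
    n_coeffs : number of pseudo-cepstral coefficients to compute
    Assumed formula: c_n = (1 + (-1)**n) / (2n) + (1/n) * sum(cos(n * w_i)).
    """
    coeffs = []
    for n in range(1, n_coeffs + 1):
        edge_term = (1 + (-1) ** n) / (2 * n)          # non-zero only for even n
        cos_sum = sum(math.cos(n * w) for w in lsp)
        coeffs.append(edge_term + cos_sum / n)
    return coeffs


# Evenly spaced LSP frequencies correspond to a flat spectrum, so the
# low-order pseudo-cepstral coefficients should be (numerically) zero.
flat = lsp_to_pseudo_cepstrum([math.pi / 3, 2 * math.pi / 3], n_coeffs=4)
```

The appeal of this kind of transform on a handset is that it needs only cosines of values the vocoder has already computed, with no additional spectral analysis of the waveform.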
The recognition step is characterized by using dynamic time warping (DTW) in comparing spectrum coefficients extracted from the input speech with spectrum coefficients of each word registered during the training step.
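For readers unfamiliar with DTW, the following is a textbook sketch of the distance computation between two sequences of feature vectors. The Euclidean local distance and the three-way step pattern are common defaults chosen for illustration; the source does not state which local constraints or distance measure are used.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each sequence is a list of frames; each frame is a list of coefficients.
    Assumptions: Euclidean frame distance, steps (i-1, j), (i, j-1), (i-1, j-1).
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b frame
                                     cost[i][j - 1],      # stretch seq_a frame
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# The same word spoken more slowly (one frame repeated) still matches exactly.
same_word = dtw_distance([[0.0], [1.0], [2.0]], [[0.0], [1.0], [1.0], [2.0]])
different = dtw_distance([[0.0], [0.0]], [[1.0]])
```

The cost table has one cell per pair of frames, so the work grows with the product of the two sequence lengths, which is why reducing the number of full DTW comparisons matters on a handset processor.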
The recognition step is characterized by performing a pre-selection step prior to the DTW for selection of the registered word having the feature most similar to that of the input speech.
The pre-selection step is characterized by performing the DTW using only a part of spectrum information extracted from each frame to select a predetermined number of registered words having relatively high similarities and subsequently performing the DTW with respect to the selected registered words to finally select a registered word having the highest similarity to the input speech.
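The two-stage search described above can be sketched as follows: a cheap DTW pass over only the first few coefficients of each frame produces a shortlist, and a full DTW pass over the shortlist picks the winner. The function names and the parameters `n_partial` and `n_candidates` are hypothetical, and taking the leading coefficients as the "part of spectrum information" is an assumption.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Textbook DTW cost with Euclidean frame distance (an assumed choice)."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def recognize_two_stage(query, templates, n_partial, n_candidates):
    """Return the name of the best-matching template.

    query        : list of frames for the input speech
    templates    : dict mapping registered word -> list of frames
    n_partial    : hypothetical count of leading coefficients used in stage 1
    n_candidates : hypothetical shortlist size passed on to stage 2
    """
    def truncate(seq):
        return [frame[:n_partial] for frame in seq]

    # Stage 1: coarse DTW on partial spectrum information.
    coarse_query = truncate(query)
    ranked = sorted(
        templates,
        key=lambda name: dtw_distance(coarse_query, truncate(templates[name])),
    )
    shortlist = ranked[:n_candidates]

    # Stage 2: full DTW restricted to the shortlisted words.
    return min(shortlist, key=lambda name: dtw_distance(query, templates[name]))


templates = {
    "home": [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    "office": [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]],
    "mom": [[9.0, 9.0], [9.0, 9.0]],
}
query = [[0.0, 5.0], [1.0, 5.0], [1.0, 5.0], [2.0, 5.0]]
best = recognize_two_stage(query, templates, n_partial=1, n_candidates=2)
```

In the example the first coefficient alone cannot separate "home" from "office", so both survive stage 1 and the full comparison in stage 2 resolves the tie.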
The pre-selection step is characterized by selecting a predetermined number of registered words having relatively high similarities using a linear matching method and subsequently performing DTW with respect to the selected registered words to finally select a registered word having the highest similarity to the input speech.
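The source does not define the linear matching method. One plausible reading, assumed here, is linear time normalization: each input frame is compared with the template frame at the proportionally equivalent position, with no warping search. The sketch below uses that reading to build the shortlist that would then be passed to DTW; all names and parameters are hypothetical.

```python
import math


def linear_match_distance(seq_a, seq_b):
    """Average frame distance after linearly mapping seq_a's time axis onto seq_b.

    Assumed interpretation of "linear matching": frame i of seq_a is paired
    with the frame of seq_b at the same relative position in time.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    total = 0.0
    for i in range(n):
        j = 0 if n == 1 else round(i * (m - 1) / (n - 1))
        total += math.dist(seq_a[i], seq_b[j])
    return total / n


def shortlist_by_linear_matching(query, templates, n_candidates):
    """Return the n_candidates registered words closest to the query."""
    ranked = sorted(
        templates,
        key=lambda name: linear_match_distance(query, templates[name]),
    )
    return ranked[:n_candidates]


templates = {
    "home": [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    "office": [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]],
    "mom": [[9.0, 9.0], [9.0, 9.0]],
}
query = [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]]
shortlist = shortlist_by_linear_matching(query, templates, n_candidates=2)
```

Linear matching costs one frame comparison per input frame rather than one per pair of frames, which is what makes it attractive as a pre-selection filter ahead of DTW.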
The simplest speech recognition technique is speaker-dependent isolated word recognition.
With this technique, only the speech of a single, previously trained speaker can be recognized, and only speech uttered in units of words (or short sentences) can be recognized.
There are various existing speech recognition algorithms. They can be broadly divided into a speech section detecting process, a feature extracting process, and a matching process.
Such processes require a relatively large amount of calculation, so a high-speed processor is needed. However, mobile telephones on the market are equipped with a built-in vocoder that already extracts the spectrum parameters of the speech, so the present invention is advantageous in that no separate feature extracting process is needed.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.


REFERENCES:
patent: 4797929 (1989-01-01), Gerson et al.
patent: 4839844 (1989-06-01), Watari
patent: 4870686 (1989-09-01), Gerson et al.
patent: 5007081 (1991-04-01), Schuckal et al.
patent: 5809453 (1998-09-01), Hunt
Matsuura et al., “Word Recognition using a neural network and a phonetic based DTW,” Neural Networks for Signal Processing IV, Proceedings of the 1994 IEEE Workshop, Sep. 1994.
