Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition
Reexamination Certificate
2002-03-07
2004-05-25
Dorvil, Richemond (Department: 2654)
C704S220000, C704S238000, C704S207000, C704S268000, C704S203000, C704S253000, C704S223000
active
06741962
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a speech recognition system and a standard pattern preparation system for preparing standard patterns to be used for the speech recognition process by the speech recognition system as well as a method of preparing the standard patterns and a computer program for preparing the standard patterns, and more particularly to a speech recognition system for recognizing a narrow-band frequency speech such as a telephone speech recognition.
2. Description of the Related Art
FIG. 1 is a block diagram illustrative of the conventional speech recognition system. The conventional speech recognition system includes a characteristic extraction unit 100 and a pattern reference unit 103. The characteristic extraction unit 100 receives an input of a voice 105 and converts the voice into a characteristic vector time series. The pattern reference unit 103 receives the characteristic vector time series and compares it with a standard pattern 104 for the speech recognition, and then outputs a speech recognition result 106. This conventional speech recognition system is described in “the fundamentals of the voice recognition”, NTT Advanced Technology, 1995.
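The comparison performed by the pattern reference unit can be illustrated with a small sketch. The text does not say which matching method is used; the example below assumes dynamic time warping (DTW) against one stored template per word, which is only one common choice. The names `dtw_distance`, `recognize`, and the toy patterns are hypothetical.

```python
import math

def dtw_distance(series_a, series_b):
    """DTW distance between two sequences of feature vectors (assumed matcher)."""
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(series_a[i - 1], series_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(series, standard_patterns):
    """Return the label of the standard pattern closest to the input series."""
    return min(standard_patterns,
               key=lambda label: dtw_distance(series, standard_patterns[label]))

# Toy standard patterns and input, invented for illustration only
patterns = {"yes": [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
            "no": [[5.0, 5.0], [5.0, 5.0]]}
observed = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
result = recognize(observed, patterns)
```

Here the input is a time-stretched copy of the "yes" template, so the warped distance to that template is zero and it is selected.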
A melcepstrum characteristic extraction may be available for the characteristic extraction unit 100. The characteristic extraction unit 100 further includes a power spectrum calculation unit 101 for calculating a power spectrum in a short term of the input voice 105 and a melcepstrum calculation unit 102 receiving the power spectrum from the power spectrum calculation unit 101 and performing a mel conversion and a cosine conversion of a logarithm of the power spectrum, thereby extracting a melcepstrum characteristic quantity.
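The chain of units 101 and 102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the common ordering (mel filter bank, then logarithm, then cosine transform), a Hamming window, triangular filters, an 8000 Hz sampling rate, and 20 filters with 12 coefficients. None of these values come from the text, and all function names are hypothetical.

```python
import cmath
import math

def power_spectrum(frame):
    """Short-term power spectrum of one frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    # Hamming window (an assumption) to reduce spectral leakage
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2)
    return spectrum

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_energies(power, sample_rate, n_filters):
    """Mel conversion: pool the power spectrum with triangular mel-spaced filters."""
    n_bins = len(power)
    top = hz_to_mel(sample_rate / 2.0)
    # Filter edges evenly spaced on the mel scale, mapped to fractional DFT bins
    edges = [mel_to_hz(top * j / (n_filters + 1)) for j in range(n_filters + 2)]
    bins = [e / (sample_rate / 2.0) * (n_bins - 1) for e in edges]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        total = 0.0
        for k, p in enumerate(power):
            if lo < k <= mid:
                total += p * (k - lo) / (mid - lo)
            elif mid < k < hi:
                total += p * (hi - k) / (hi - mid)
        energies.append(total)
    return energies

def mel_cepstrum(frame, sample_rate=8000, n_filters=20, n_ceps=12):
    """Melcepstrum of one frame: power spectrum, mel pooling, log, cosine transform."""
    power = power_spectrum(frame)
    log_mel = [math.log(e + 1e-10)
               for e in mel_energies(power, sample_rate, n_filters)]
    # Cosine conversion (DCT-II) of the log mel energies
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, v in enumerate(log_mel))
            for c in range(n_ceps)]

# Example frame: 256 samples of a 440 Hz tone at 8000 Hz (illustrative input)
frame = [math.sin(2 * math.pi * 440.0 * i / 8000.0) for i in range(256)]
coefficients = mel_cepstrum(frame)
```

Applying `mel_cepstrum` to successive frames of the input voice would yield the characteristic vector time series passed on for pattern comparison.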
FIG. 2 is a block diagram illustrative of a conventional standard pattern preparation system. The conventional standard pattern preparation system prepares the above described standard pattern 104 to be referred to by the above described conventional speech recognition system shown in FIG. 1. The conventional standard pattern preparation system includes a characteristic extraction unit 200 and a standard pattern preparation unit 204. The characteristic extraction unit 200 further includes a power spectrum calculation unit 201 for calculating a power spectrum in a short term of a learning voice signal from a learning voice storing unit 203, and a melcepstrum calculation unit 202 receiving the power spectrum from the power spectrum calculation unit 201 and performing a mel conversion and a cosine conversion of a logarithm of the power spectrum, thereby extracting a melcepstrum characteristic quantity.
The standard pattern preparation unit 204 receives the melcepstrum characteristic quantity from the melcepstrum calculation unit 202 and prepares a standard pattern. The standard pattern is stored in a standard pattern storing unit 205.
With reference again to FIG. 1, the process for recognition of the narrow band frequency voice such as the telephone voice by the conventional speech recognition system will be described.
The telephone voice has a narrow frequency band and is strongly affected by noise, which generally makes the voice difficult to recognize. The frequency band of the telephone voice ranges from 300 Hz to 3400 Hz. A first formant of the vowel, or the primary characteristic frequency region, is important for the speech recognition. Depending on the speaker, this first formant or primary characteristic frequency region may lie under 300 Hz. In this case, the voice signal entered from the telephone terminal may lack the first formant of the vowel or the primary characteristic frequency region under 300 Hz.
The frequency range of the friction noise may often be over 3000 Hz. In this case, the voice signal entered from the telephone terminal may lack the friction noise.
Because of this restriction on the frequency band, the recognition of the telephone voice with the narrow frequency band is lower in accuracy than the recognition of the microphone voice with a wide frequency band.
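The effect of the band restriction can be shown with a toy example. The sketch below treats the telephone channel as an idealized pass band of 300 Hz to 3400 Hz (a real channel has gradual roll-off) and applies it to a few hypothetical spectral components; the component frequencies and amplitudes are invented for illustration.

```python
TELEPHONE_BAND_HZ = (300.0, 3400.0)  # band stated in the text

def band_limit(components, band=TELEPHONE_BAND_HZ):
    """Keep only the (frequency_hz, amplitude) pairs inside the pass band."""
    low, high = band
    return [(f, a) for f, a in components if low <= f <= high]

# Hypothetical spectral components of a voice, for illustration only
voice = [(250.0, 1.0),   # a low first formant, under 300 Hz
         (1200.0, 0.8),  # a mid-band component
         (4000.0, 0.3)]  # friction-noise energy above the band

narrow = band_limit(voice)
print(narrow)  # [(1200.0, 0.8)]
```

Only the mid-band component survives, so a recognizer working on the telephone voice never sees the low formant or the high-frequency friction noise.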
Japanese laid-open patent publication No. 2000-250577 discloses the following conventional technique for improving the frequency characteristic of the voice with the narrow frequency band entered from the microphone. This conventional technique prevents any lack of the voice information and also improves the speech recognition characteristic in the presence of noise. A characteristic vector is selected by a first code book from a voice input pattern as received by a first voice receiver. A correction vector is selected from a second code book in correspondence with the index of the selected vector. The characteristic vector and the correction vector are then added to estimate the characteristic vector of the voice as it would be received by a second receiver, which ensures a higher voice-receiving sensitivity in a wide frequency band than the first receiver.
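A rough sketch of this code book correction follows. It assumes that the nearest code word is chosen by Euclidean distance and that the correction vector is added to the input characteristic vector; the cited publication may differ on both points. The function names and the two-entry code books are hypothetical.

```python
import math

def nearest_index(vector, codebook):
    """Index of the code word closest to the vector (Euclidean distance assumed)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))

def estimate_wideband(vector, first_codebook, second_codebook):
    """Add the correction vector that shares the index of the selected code word."""
    index = nearest_index(vector, first_codebook)
    correction = second_codebook[index]
    return [v + c for v, c in zip(vector, correction)]

# Toy code books, invented for illustration only
first_codebook = [[0.0, 0.0], [1.0, 1.0]]       # narrow-band code words
second_codebook = [[0.5, -0.5], [0.25, 0.5]]    # correction vectors, same indexing

estimated = estimate_wideband([1.0, 1.5], first_codebook, second_codebook)
print(estimated)  # [1.25, 2.0]
```

The two code books are linked only through the shared index, so the quality of the estimate depends on how well the first code book partitions the narrow-band feature space.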
In the above circumstances, the development of a novel speech recognition system is desirable.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the present invention to provide a novel recognition system free from the above problems.
It is a further object of the present invention to provide a novel recognition system exhibiting a speech recognition performance in a narrow frequency band close to the performance in a wide frequency band.
It is a still further object of the present invention to provide a novel standard pattern preparation system for preparing standard patterns to be used for the speech recognition process by the speech recognition system free from the above problems.
It is yet a further object of the present invention to provide a novel standard pattern preparation system for preparing standard patterns to be used for the speech recognition process by the speech recognition system exhibiting a speech recognition performance in a narrow frequency band close to the performance in a wide frequency band.
It is yet a further object of the present invention to provide a method of preparing the standard patterns free from the above problems.
It is yet a further object of the present invention to provide a method of preparing the standard patterns exhibiting a speech recognition performance in a narrow frequency band close to the performance in a wide frequency band.
It is yet a further object of the present invention to provide a computer program for preparing the standard patterns free from the above problems.
It is yet a further object of the present invention to provide a computer program for preparing the standard patterns exhibiting a speech recognition performance in a narrow frequency band close to the performance in a wide frequency band.
The present invention provides a speech recognition system for recognizing an input voice of a narrow frequency band. The speech recognition system includes: a frequency band converting unit for converting the input voice of the narrow frequency band into a pseudo voice of a wide frequency band which covers an entirety of the narrow frequency band and which is wider than the narrow frequency band.
The above and other objects, features and advantages of the present invention will be apparent from the following descriptions.
REFERENCES:
patent: 4881266 (1989-11-01), Nitta et al.
patent: 5455888 (1995-10-01), Iyengar et al.
patent: 5581652 (1996-12-01), Abe et al.
patent: 5799276 (1998-08-01), Komissarchik et al.
patent: 5950153 (1999-09-01), Ohmori et al.
patent: 5978759 (1999-11-01), Tsushima et al.
patent: 6236964 (2001-05-01), Tamura et al.
patent: 6539355 (2003-03-01), Omori et al.
patent: 0911807 (1999-04-01), None
patent: 62-37795 (1987-08-01), None
patent: 7-98599 (1995-04-01), None
patent: 3110105 (2000-09-01), None
patent: 2000-244653 (2000-09-01), None
patent: 2000-250577
Dorvil Richemond
Han Qi
Scully Scott Murphy & Presser
Profile ID: LFUS-PAI-O-3234858