Speech feature extraction system

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S205000, C704S206000, C708S300000

Reexamination Certificate

active

06493668

ABSTRACT:

BACKGROUND OF THE INVENTION
This invention relates to a speech feature extraction system for use in speech recognition, voice identification, and voice authentication systems. More specifically, this invention relates to a speech feature extraction system that can be used to create a speech recognition system or other speech processing system with a reduced error rate.
Generally, a speech recognition system is an apparatus that attempts to identify spoken words by analyzing the speaker's voice signal. Speech is converted into an electronic form from which features are extracted. The system then attempts to match a sequence of features to a previously stored sequence of models associated with known speech units. When a sequence of features matches a sequence of models in accordance with specified rules, the corresponding words are deemed to be recognized by the speech recognition system.
However, background sounds such as radios, car noise, and other nearby speakers can make it difficult to extract useful features from the speech. In addition, ambient conditions, such as the use of different microphones or telephone handsets, a different telephone line, or the speaker's distance from the microphone, can interfere with system performance. Differences between speakers, changes in speaker intonation or emphasis, and even the speaker's health can also adversely impact system performance. For a further description of some of these problems, see Richard A. Quinnell, “Speech Recognition: No Longer a Dream, But Still a Challenge,” EDN Magazine, Jan. 19, 1995, pp. 41-46.
In most speech recognition systems, the speech features are extracted by cepstral analysis, which generally involves measuring the energy in specific frequency bands. The product of that analysis reflects the amplitude of the signal in those bands. Analysis of these amplitude changes over successive time periods can be modeled as an amplitude modulated signal.
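As a rough illustration of measuring energy in specific frequency bands, the following Python sketch computes log band energies for a single frame using a naive DFT. The function names, the equal-width band layout, and the frame parameters are assumptions made for illustration only. A full cepstral front end would normally also apply windowing, perceptually spaced bands, and a cosine transform of the log energies.

```python
import cmath
import math


def band_log_energies(frame, num_bands):
    """Log energy in `num_bands` equal-width bands of one frame (naive DFT)."""
    n = len(frame)
    half = n // 2
    # Power spectrum for the non-negative frequency bins 0 .. half-1.
    power = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2)
    width = half // num_bands
    # A small floor keeps the logarithm finite for empty bands.
    return [math.log(sum(power[b * width:(b + 1) * width]) + 1e-12)
            for b in range(num_bands)]


def sine_frame(freq_hz, sample_rate, n, amplitude=1.0):
    """Hypothetical test signal: one frame of a pure sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]
```

Calling `band_log_energies` on successive frames yields one energy value per band per frame, and the frame-to-frame change in each band is the amplitude modulation that this kind of analysis captures.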
Whereas the human ear is sensitive to frequency modulation as well as amplitude modulation in received speech signals, this frequency modulated content is only partially reflected in systems that perform cepstral analysis.
Accordingly, it would be desirable to provide a speech feature extraction system capable of capturing the frequency modulation characteristics of speech, as well as previously known amplitude modulation characteristics.
It also would be desirable to provide speech recognition and other speech processing systems that incorporate feature extraction systems that provide information on frequency modulation characteristics of the input speech signal.
SUMMARY OF THE INVENTION
In view of the foregoing, it is an object of the present invention to provide a speech feature extraction system capable of capturing the frequency modulation characteristics of speech, as well as previously known amplitude modulation characteristics.
It also is an object of this invention to provide speech recognition and other speech processing systems that incorporate feature extraction systems that provide information on frequency modulation characteristics of the input speech signal.
The present invention provides a speech feature extraction system that reflects frequency modulation characteristics of speech as well as amplitude characteristics. This is done by a feature extraction stage that includes a plurality of complex bandpass filters in adjacent frequency bands. The output of alternate complex bandpass filters is multiplied by the conjugate of the output of the bandpass filter in the adjacent lower frequency band, and the resulting signal is low pass filtered.
Each of the low pass filter outputs is processed to compute two components: an FM component that is substantially sensitive to the frequency of the signal passed by the adjacent bandpass filters from which the low pass filter output was generated, and an AM component that is substantially sensitive to the amplitude of the signal passed by the adjacent bandpass filters. The FM component reflects the difference in the phase of the outputs of the adjacent bandpass filters used to generate the low pass filter output.
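The processing chain described above can be sketched in Python for a single pair of adjacent bands. This is a minimal sketch under stated assumptions, not the patented implementation: the one-pole complex resonators, the moving-average low pass filter, and the use of the magnitude and phase angle as the AM and FM readouts are all illustrative choices, and every function name and parameter value is hypothetical.

```python
import cmath
import math


def complex_bandpass(signal, center_hz, sample_rate, pole_radius=0.95):
    """Assumed filter: one-pole complex resonator y[n] = x[n] + p * y[n-1]."""
    pole = pole_radius * cmath.exp(2j * math.pi * center_hz / sample_rate)
    out, state = [], 0j
    for x in signal:
        state = x + pole * state
        out.append(state)
    return out


def moving_average(values, length):
    """Assumed low pass filter: running mean over the last `length` samples."""
    out, total = [], 0j
    for i, v in enumerate(values):
        total += v
        if i >= length:
            total -= values[i - length]
        out.append(total / min(i + 1, length))
    return out


def am_fm_components(signal, low_hz, high_hz, sample_rate, smooth_len=80):
    """AM and FM tracks for one pair of adjacent bands."""
    lo = complex_bandpass(signal, low_hz, sample_rate)
    hi = complex_bandpass(signal, high_hz, sample_rate)
    # Upper-band output times the conjugate of the adjacent lower-band output.
    product = [h * l.conjugate() for h, l in zip(hi, lo)]
    smoothed = moving_average(product, smooth_len)
    am = [abs(z) for z in smoothed]          # magnitude: amplitude-sensitive
    fm = [cmath.phase(z) for z in smoothed]  # angle: phase difference of bands
    return am, fm


def tone(freq_hz, sample_rate, n, amplitude=1.0):
    """Hypothetical test signal: a pure cosine tone."""
    return [amplitude * math.cos(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]
```

In this sketch, scaling the input amplitude scales the magnitude readout but leaves the angle readout unchanged, while moving a tone to a different frequency changes the angle, because the two filters then shift its phase by different amounts.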
The AM and FM components are then processed using known feature enhancement techniques, such as discrete cosine transform, mel-scale translation, mean normalization, delta and acceleration analysis, linear discriminant analysis, and principal component analysis, to generate speech features suitable for statistical processing or other recognition or identification methods.
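Three of the enhancement techniques named above can be sketched in a few lines of Python. These are common textbook forms, not formulas taken from the patent: the unnormalized DCT-II, the whole-utterance mean, the two-frame delta window, and all function names are assumptions chosen for illustration.

```python
import math


def dct_ii(values):
    """Unnormalized DCT-II of a single feature vector."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n)]


def mean_normalize(frames):
    """Subtract the per-dimension mean taken over all frames."""
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


def deltas(frames):
    """Central difference (f[t+1] - f[t-1]) / 2, repeating the edge frames."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, last)]
        out.append([(b - a) / 2 for a, b in zip(prev, nxt)])
    return out
```

Acceleration features can be obtained in this sketch by applying `deltas` a second time to its own output.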


REFERENCES:
patent: 4221934 (1980-09-01), Schiff
patent: 4300229 (1981-11-01), Hirosaki
patent: 4660216 (1987-04-01), Claasen et al.
patent: 4729112 (1988-03-01), Millar
Jelinek, Frederick, “Hidden Markov Models,” Statistical Methods for Speech Recognition, Chapter 2, The MIT Press: pp. 15-37 (1997).
Quinnell, Richard A., “Speech Recognition: No Longer a Dream, But Still a Challenge,” EDN Magazine: pp. 41-46 (Jan. 19, 1995).
