Method and apparatus for a robust feature extraction for...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S265000, C704S233000, C704S234000

Reexamination Certificate

active

06678657

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field
The invention relates to a method and a means for performing robust feature extraction for speech recognition in a noisy environment.
2. Description of Related Art
In speech recognition, a noisy environment is a major obstacle to accurate recognition. All types of noise influence speech recognition and may degrade recognition accuracy drastically.
Speech recognition is becoming more and more important, especially in mobile telephony and in access systems that allow access after recognising a spoken password. In these areas, the most problematic types of noise are additive stationary or non-stationary background noise. Another type of noise that degrades recognition accuracy is the influence of the frequency characteristics of a transmission channel, if the speech to be recognised is transmitted via such a channel. Additive noise may consist of background noise in combination with noise generated on a transmission line.
It is therefore known from the prior art to provide so-called linear or non-linear spectral subtraction. Spectral subtraction is a noise suppression technique that reduces the effects of additive noise on speech. It estimates the magnitude or power spectrum of clean speech by explicitly subtracting the noise magnitude or power spectrum from the noisy magnitude or power spectrum. The technique was developed for enhancing speech in various communication situations.
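The subtraction described above can be illustrated with a minimal sketch in Python. It is not taken from the patent: the function name `spectral_subtraction` and the `floor` parameter (a spectral floor that prevents negative power estimates) are assumptions made for illustration only.

```python
import numpy as np

def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Estimate the clean-speech power spectrum of one frame.

    noisy_power: power spectrum of the noisy frame (1-D array)
    noise_power: estimated noise power spectrum (same shape)
    floor: hypothetical spectral floor, as a fraction of the noisy power,
           used here so that the estimate never becomes negative
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Explicit subtraction of the noise spectrum from the noisy spectrum.
    clean = noisy_power - noise_power
    # Where the noise estimate exceeds the observation, fall back to the floor.
    return np.maximum(clean, floor * noisy_power)

estimate = spectral_subtraction([4.0, 1.0, 9.0], [1.0, 2.0, 1.0])
```

In this example the second component, where the noise estimate exceeds the observed power, is clamped to the floor rather than becoming negative. Components of this kind are one source of residual noise after subtraction.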
Because spectral subtraction requires the noise to be estimated during pauses, it also assumes that the noise characteristics change slowly, so that the noise estimate remains valid. The success of this method depends on the availability of a robust endpoint or voice activity detector to separate speech from noise. Good separation of speech and noise is thus a necessary condition, but it is difficult to achieve at a low signal-to-noise ratio (SNR).
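As a hedged sketch of noise estimation during pauses, the following Python function averages the power spectra of the frames that a detector has marked as pauses. The function name and the `is_pause` flags are assumptions for illustration. The detector that produces the flags is not shown, and its reliability is exactly the difficulty described above.

```python
import numpy as np

def estimate_noise_spectrum(power_frames, is_pause):
    """Average the power spectra of all frames flagged as speech pauses.

    power_frames: array of shape (frames, components)
    is_pause: boolean flags of shape (frames,), assumed to come from an
              endpoint or voice activity detector (not implemented here)
    """
    power_frames = np.asarray(power_frames, dtype=float)
    is_pause = np.asarray(is_pause, dtype=bool)
    if not is_pause.any():
        raise ValueError("no pause frames available for noise estimation")
    # The mean over pause frames is only valid if the noise changes slowly.
    return power_frames[is_pause].mean(axis=0)

noise = estimate_noise_spectrum(
    [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]],
    [True, True, False],
)
```

If a speech frame is wrongly flagged as a pause, its energy leaks into the noise estimate, which is why the quality of the detector matters so much at low SNR.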
In addition, although spectral subtraction is computationally efficient, since the noise is estimated during speech pauses, and although it can be implemented as a pre-processing technique that leaves the other processing stages unchanged, its performance is strongly dependent on the noise and on how the noise is extracted. The associated problem is that even if the wide-band noise is reduced, some residual noise remains (Junqua et al.; “Robustness in automatic speech recognition”; Kluwer Academic Publishers; 1996; Section 9.2 Speech Enhancement, pages 277 ff.).
Although the above-mentioned methods may improve speech recognition, the estimation of the noise characteristics is crucial for these approaches. As mentioned above, a speech-to-noise discrimination is needed to mark those segments of a speech signal that contain only noise. Such a discrimination cannot be free of errors and is difficult to achieve. Moreover, segments of the speech signal that contain a superposition of speech and stationary noise can be described by the superposition of corresponding distribution functions for a spectral noise component and a spectral speech component. These distribution functions overlap to a degree that depends on the SNR: the lower the SNR, the greater the overlap. In this case it therefore cannot be decided whether short-term spectra contain speech in spectral regions where the spectral magnitude of the speech is equal to or smaller than that of the noise.
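The dependence of the overlap on the SNR can be made concrete with a toy model. The following Python sketch assumes, purely for illustration, that the noise and speech components follow Gaussian distributions with the same spread on a dB scale and that their means are separated by the SNR. Neither this model nor the parameter `sigma_db` comes from the patent.

```python
import math

def overlap_coefficient(snr_db, sigma_db=6.0):
    """Overlap area of two equal-variance Gaussians whose means differ by snr_db.

    sigma_db is a hypothetical common standard deviation in dB.
    """
    separation = abs(snr_db)
    # For equal variances the densities cross at the midpoint, so the
    # overlap area is 2 * Phi(-separation / (2 * sigma)).
    z = -separation / (2.0 * sigma_db)
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 2.0 * phi

overlaps = {snr: overlap_coefficient(snr) for snr in (0.0, 5.0, 20.0)}
```

Under these assumptions the overlap is complete at 0 dB and shrinks as the SNR grows, which matches the qualitative statement above that the two components become harder to tell apart as the SNR falls.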
SUMMARY OF THE INVENTION
The present invention provides a method and an apparatus that overcome these problems and allow more robust speech recognition in noisy environments.
According to the invention, it is advantageous that a short-term spectrum containing only noise is smoothed and that, in noisy speech segments, unreliable spectral components are interpolated from so-called reliable ones. This results in a robust feature extraction, which in turn supports improved speech recognition.
It is advantageous to perform the interpolation based on at least one spectral component of an adjacent short-term spectrum and/or at least one spectral component preceding in time, as it can be expected that a so-called unreliable speech component, which has a low probability of containing speech, is thereby smoothed.
Improved speech recognition is achieved by taking two adjacent spectral components and one preceding in time.
A further advantage of the present invention is that the calculated probability is compared to a threshold in order to determine which spectral components have to be interpolated.
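The thresholding and interpolation steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method. The sketch assumes that the speech probabilities are already given, that "adjacent" means the neighbouring components in frequency within the same frame, that "preceding in time" means the same component in the previous frame, and that the interpolation is a plain average. The function name and the default `threshold` are hypothetical.

```python
import numpy as np

def interpolate_unreliable(spectra, speech_prob, threshold=0.5):
    """Replace unreliable spectral components by an average of neighbours.

    spectra, speech_prob: arrays of shape (frames, components).
    A component counts as unreliable if its probability of containing
    speech is below the (hypothetical) threshold.
    """
    spectra = np.asarray(spectra, dtype=float)
    speech_prob = np.asarray(speech_prob, dtype=float)
    out = spectra.copy()
    n_frames, n_comp = spectra.shape
    for t in range(n_frames):
        for k in range(n_comp):
            if speech_prob[t, k] >= threshold:
                continue  # reliable component, keep unchanged
            neighbours = []
            # Adjacent components in frequency, used only if reliable.
            if k > 0 and speech_prob[t, k - 1] >= threshold:
                neighbours.append(spectra[t, k - 1])
            if k < n_comp - 1 and speech_prob[t, k + 1] >= threshold:
                neighbours.append(spectra[t, k + 1])
            # Same component in the preceding frame (already processed).
            if t > 0:
                neighbours.append(out[t - 1, k])
            if neighbours:
                out[t, k] = np.mean(neighbours)
    return out

smoothed = interpolate_unreliable(
    [[1.0, 2.0, 3.0], [4.0, 100.0, 6.0]],
    [[1.0, 1.0, 1.0], [1.0, 0.1, 1.0]],
)
```

In the usage example only the middle component of the second frame falls below the threshold, so it is replaced by the average of its two frequency neighbours (4.0 and 6.0) and the preceding component in time (2.0), while all other components are left untouched.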
It is further advantageous to interpolate the spectral component on the basis of noiseless speech.
Performing two interpolations results in even better speech recognition.
According to the present invention, it is advantageous to base the division of the short-term spectra on a MEL frequency range, as the MEL frequency range is based on the human ear.
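A division of the spectrum on a MEL scale can be sketched in Python as follows. The conversion formula used here is one commonly used approximation of the MEL scale. The source text does not state which formula or how many bands are used, so the formula, the function names, and the example values (0 to 4000 Hz, 8 bands) are assumptions.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (a common approximation)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_band_edges(f_min_hz, f_max_hz, n_bands):
    """Band edges in Hz that are equally spaced on the mel scale."""
    mel_edges = np.linspace(hz_to_mel(f_min_hz), hz_to_mel(f_max_hz), n_bands + 1)
    return mel_to_hz(mel_edges)

# Hypothetical example: 8 bands between 0 Hz and 4000 Hz.
edges = mel_band_edges(0.0, 4000.0, 8)
```

Bands that are equally wide in mel become wider in Hz as frequency increases, which gives finer resolution at low frequencies, in line with the frequency resolution of human hearing.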
It is further advantageous to use the method in speech recognition for controlling electronic devices, e.g. mobile phones, telephones or access systems that use speech to allow access, dialling, etc.


REFERENCES:
patent: 4752956 (1988-06-01), Sluijter
patent: 4897878 (1990-01-01), Boll et al.
patent: 5455888 (1995-10-01), Iyengar et al.
patent: 5668927 (1997-09-01), Chan et al.
Krembel, L., European Search Report, App. No. EP 99 20 3613, May 30, 2000, pp. 1-3.
Matsumoto, H. et al., “Smoothed Spectral Subtraction for a Frequency-Weighted HMM in Noisy Speech Recognition,” Proceedings of the International Conference on Spoken Language Processing, XX,XX, Jan. 1, 1996, pp. 905-908.
Yoma, N.B., et al., “Weighted matching Algorithms and Reliability in Noise Cancelling by Spectral Subtraction” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), US, Los Alamitos, IEEE Comp., Soc. Press, 1997, pp. 1171-1174.
Soon, I.Y., et al., “Improved Noise Suppression Filter Using Self-adaptive Estimator of Probability of Speech Absence,” Signal Processing, NL, Amsterdam, vol. 75, No. 2, Jun. 1999, pp. 151-159.
Yang, R., et al., “Noise Compensation for Speech Recognition in Car Noise Environments,” Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), US, New York, IEEE, 1995, pp. 433-436.
