Reexamination Certificate
1999-07-23
2002-09-17
Chawan, Vijay B (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S226000, C704S227000, C704S228000, C704S233000
active
06453289
ABSTRACT:
FIELD OF THE INVENTION
The invention relates to noise reduction and voice activity detection in speech communication systems.
BACKGROUND OF THE INVENTION
The presence of background noise in a speech communication system affects its perceived grade of service in a number of ways. For example, significant levels of noise can reduce intelligibility, cause listener fatigue, and degrade performance of the speech compression algorithm used in the system.
Reduction of background noise levels can mitigate such problems and enhance overall performance of the speech communication system. In the highly competitive area of communications, improved voice quality is becoming an increasingly important concern to customers when making purchasing decisions. Since noise reduction can be an important element for overall improved voice quality, noise reduction can have a critical impact on these decisions.
Voice encoding and decoding devices (hereinafter referred to as “codecs”) are used to encode speech for more efficient use of bandwidth during transmission. For example, a code excited linear prediction (CELP) codec is a stochastic encoder which analyzes a speech signal and models excitation frames therein using vectors selected from a codebook. The vectors or other parameters can be transmitted. These parameters can then be decoded to produce synthesized speech. CELP is particularly useful for digital communication systems wherein speech quality, data rate and cost are significant issues.
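The codebook selection step described above can be illustrated with a minimal sketch. It is not the codec's actual search: a real CELP encoder passes each candidate through a synthesis filter and uses perceptual weighting, both omitted here. The function name `search_codebook` and the toy codebook are hypothetical.

```python
def search_codebook(target, codebook):
    """Return (index, gain) of the codebook vector closest to target.

    Simplified assumption: candidates are compared directly with the
    target, with no synthesis filter or perceptual weighting.
    """
    best_index, best_gain, best_error = -1, 0.0, float("inf")
    for index, vector in enumerate(codebook):
        energy = sum(v * v for v in vector)
        if energy == 0.0:
            continue  # an all-zero vector cannot be scaled to match anything
        correlation = sum(t * v for t, v in zip(target, vector))
        gain = correlation / energy  # least-squares optimal gain for this vector
        error = sum((t - gain * v) ** 2 for t, v in zip(target, vector))
        if error < best_error:
            best_index, best_gain, best_error = index, gain, error
    return best_index, best_gain


# Toy codebook with three candidate excitation vectors (illustrative values).
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
index, gain = search_codebook([2.0, 2.1, 1.9, 2.0], codebook)
```

Only the index and gain would need to be transmitted; a decoder holding the same codebook can rebuild the excitation from them.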
A need exists for a noise reduction algorithm which can enhance the performance of a codec. Noise reduction algorithms often use a noise estimate. Because the noise is estimated during input signal segments that contain no speech, a reliable noise estimate depends on correctly identifying those segments. Accordingly, a need also exists for a reliable and robust voice activity detector.
SUMMARY OF THE INVENTION
In accordance with an aspect of the present invention, a noise reduction algorithm is provided to overcome several disadvantages of existing speech communication systems, such as reduced intelligibility, listener fatigue, and degraded compression algorithm performance.
In accordance with another aspect of the present invention, a noise reduction algorithm employs spectral amplitude enhancement. Processes such as spectral subtraction, multiplication of noisy speech via an adaptive gain, spectral noise subtraction, spectral power subtraction, or an approximated Wiener filter, however, can also be used.
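For orientation, the alternative processes named above can each be written as a per-frequency-bin gain applied to the noisy spectrum. The sketch below uses common textbook forms, not the formulas of this invention. The function names and the `floor` parameter, which limits the maximum attenuation, are assumptions.

```python
import math


def magnitude_subtraction_gain(noisy_mag, noise_mag, floor=0.1):
    """Spectral (magnitude) subtraction expressed as a gain: 1 - |N|/|Y|."""
    if noisy_mag <= 0.0:
        return floor
    return max(1.0 - noise_mag / noisy_mag, floor)


def power_subtraction_gain(noisy_mag, noise_mag, floor=0.1):
    """Spectral power subtraction: sqrt(1 - |N|^2/|Y|^2)."""
    if noisy_mag <= 0.0:
        return floor
    ratio = (noise_mag / noisy_mag) ** 2
    return max(math.sqrt(max(1.0 - ratio, 0.0)), floor)


def wiener_gain(noisy_mag, noise_mag, floor=0.1):
    """Approximated Wiener filter: 1 - |N|^2/|Y|^2."""
    if noisy_mag <= 0.0:
        return floor
    ratio = (noise_mag / noisy_mag) ** 2
    return max(1.0 - ratio, floor)


# One bin where the noisy magnitude is twice the estimated noise magnitude.
gains = (
    magnitude_subtraction_gain(2.0, 1.0),
    power_subtraction_gain(2.0, 1.0),
    wiener_gain(2.0, 1.0),
)
```

In all three cases the enhanced magnitude is the gain multiplied by the noisy magnitude, which is why these methods can be viewed as multiplication of the noisy speech by an adaptive gain.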
In accordance with another aspect of the present invention, noise estimation in the noise reduction algorithm is facilitated by the use of information generated by a voice activity detector which indicates when a frame comprises noise. An improved voice activity detector is provided in accordance with an aspect of the present invention which is reliable and robust in determining the presence of speech or noise in the frames of an input signal.
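One common way to use such a voice activity decision is to update the noise spectral estimate only on frames flagged as noise. The sketch below assumes a simple recursive average; the function name `update_noise_estimate` and the smoothing factor `alpha` are hypothetical and not taken from the invention.

```python
def update_noise_estimate(noise_estimate, frame_spectrum, is_noise_frame, alpha=0.9):
    """Recursively average the noise spectrum, but only on noise-only frames.

    alpha close to 1.0 makes the estimate change slowly (assumed value).
    """
    if not is_noise_frame:
        # Speech is present: keep the previous estimate unchanged.
        return list(noise_estimate)
    return [
        alpha * old + (1.0 - alpha) * new
        for old, new in zip(noise_estimate, frame_spectrum)
    ]


estimate = [1.0, 1.0]
estimate = update_noise_estimate(estimate, [2.0, 0.0], is_noise_frame=True, alpha=0.5)
held = update_noise_estimate(estimate, [9.0, 9.0], is_noise_frame=False, alpha=0.5)
```

Freezing the estimate during speech keeps speech energy from leaking into the noise spectrum, which is why a missed detection directly degrades the noise reduction.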
In accordance with yet another aspect of the present invention, gain for the noise reduction algorithm is determined using a smoothed noise spectral estimate and smoothed input noisy speech spectra. Smoothing is performed using critical bands comprising frequency bands corresponding to the human auditory system.
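Smoothing over bands can be sketched as replacing each spectral bin with the mean of its band. The band edges below are placeholders chosen only to show narrower bands at low frequencies and wider bands at high frequencies; they are not actual critical-band boundaries, and the function name `smooth_over_bands` is hypothetical.

```python
def smooth_over_bands(power_spectrum, band_edges):
    """Replace every bin with the mean power of the band that contains it.

    band_edges must be strictly increasing bin indices that start at 0 and
    end at len(power_spectrum).
    """
    smoothed = []
    for low, high in zip(band_edges[:-1], band_edges[1:]):
        band = power_spectrum[low:high]
        mean_power = sum(band) / len(band)
        smoothed.extend([mean_power] * len(band))
    return smoothed


# Eight bins grouped into bands of width 2, 2 and 4 (illustrative only).
spectrum = [1.0, 3.0, 2.0, 6.0, 1.0, 1.0, 1.0, 5.0]
smoothed = smooth_over_bands(spectrum, [0, 2, 4, 8])
```

Applying the same smoothing to both the noise estimate and the noisy speech spectrum before forming the gain reduces bin-to-bin fluctuation in that gain.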
In accordance with still yet another aspect of the present invention, the noise reduction algorithm can be either integrated in or used with a codec. A codec is provided having voice activity detection and noise reduction functions integrated therein. Noise reduction can coexist with a codec in a pre-compression or post-compression configuration.
In accordance with another aspect of the present invention, background noise in the encoded signal is reduced via swirl reduction techniques such as identifying spectral outlier segments in an encoded signal and replacing line spectral frequencies therein with weighted average line spectral frequencies. An upper limit can also be placed on the adaptive codebook gain employed by the encoder for those segments identified as being spectral outlier segments. A constant C and a lower limit K are selected for use with the gain function to control the amount of noise reduction and spectral distortion introduced in cases of low signal to noise ratio.
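A rough sketch of the two swirl reduction steps named above follows. How outlier segments are detected is not shown, and the weights, the history length, and the gain limit are placeholder assumptions rather than values from the invention. All function names are hypothetical.

```python
def weighted_average_lsf(lsf_history, weights):
    """Weighted average of recent line spectral frequency (LSF) vectors."""
    total_weight = sum(weights)
    order = len(lsf_history[0])
    return [
        sum(w * frame[i] for w, frame in zip(weights, lsf_history)) / total_weight
        for i in range(order)
    ]


def smooth_outlier_lsf(current_lsf, lsf_history, weights, is_outlier):
    """Replace the LSFs of a spectral outlier segment; pass others through."""
    if is_outlier:
        return weighted_average_lsf(lsf_history, weights)
    return list(current_lsf)


def limit_adaptive_gain(adaptive_gain, is_outlier, upper_limit=0.5):
    """Cap the adaptive codebook gain on outlier segments (assumed limit)."""
    return min(adaptive_gain, upper_limit) if is_outlier else adaptive_gain


history = [[0.1, 0.2], [0.3, 0.4]]  # two past frames of a 2nd-order toy model
replaced = smooth_outlier_lsf([0.9, 1.0], history, [1.0, 3.0], is_outlier=True)
capped = limit_adaptive_gain(1.2, is_outlier=True)
```

Because a weighted average of sorted LSF vectors with non-negative weights is itself sorted, the replacement preserves the ordering that line spectral frequencies must satisfy.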
In accordance with another aspect of the present invention, a voice activity detector is provided to facilitate estimation of noise in a system and, in turn, to support a noise reduction algorithm that uses the estimated noise, for example to determine a gain function.
In accordance with yet another aspect of the present invention, the voice activity detector determines pitch lag and performs periodicity detection using enhanced speech which has been processed to reduce noise therein.
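Pitch lag and periodicity are often estimated with a normalized autocorrelation search. The sketch below shows that generic approach, which is not necessarily the detector's method. The function name `estimate_pitch_lag`, the lag range, and the idea of thresholding the returned score are assumptions.

```python
import math


def estimate_pitch_lag(frame, min_lag, max_lag):
    """Return (lag, score) maximizing the normalized autocorrelation.

    min_lag must be at least 1. The score lies in [-1, 1]; values near 1
    indicate a strongly periodic (voiced) frame.
    """
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        delayed = frame[:-lag]
        current = frame[lag:]
        norm = math.sqrt(
            sum(x * x for x in current) * sum(y * y for y in delayed)
        )
        if norm == 0.0:
            continue  # silent segment: correlation is undefined
        score = sum(x * y for x, y in zip(current, delayed)) / norm
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic periodic frame with a period of 20 samples.
frame = [math.sin(2.0 * math.pi * n / 20.0) for n in range(160)]
lag, score = estimate_pitch_lag(frame, min_lag=10, max_lag=30)
```

Running such a search on noise-reduced speech rather than on the raw input is the point made above: residual noise lowers the correlation peak and makes periodicity harder to detect.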
In accordance with still yet another aspect of the present invention, the voice activity detector subjects input speech to automatic gain control.
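A minimal frame-based automatic gain control can be sketched as follows. The target level, the smoothing factor, and the gain ceiling are illustrative assumptions, and the function name `agc_frame` is hypothetical.

```python
import math


def agc_frame(frame, previous_gain, target_rms=0.1, smoothing=0.9, max_gain=10.0):
    """Scale a frame toward target_rms using a slowly varying gain.

    Returns the scaled frame and the gain to carry into the next frame.
    """
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    if rms > 0.0:
        desired_gain = min(target_rms / rms, max_gain)
    else:
        desired_gain = previous_gain  # silent frame: hold the gain
    gain = smoothing * previous_gain + (1.0 - smoothing) * desired_gain
    return [gain * x for x in frame], gain


scaled, gain = agc_frame([0.5, -0.5, 0.5, -0.5], previous_gain=1.0, smoothing=0.5)
```

Normalizing the input level in this way lets fixed energy thresholds in a detector behave consistently across loud and quiet talkers.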
In accordance with an aspect of the present invention, a voice activity detector generates short-term and long-term voice activity flags for consideration in detecting voice activity.
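One plausible reading of short-term and long-term flags is a per-frame decision combined with a decision held over recent frames. The sketch below assumes an energy threshold for the short-term flag and a fixed history window for the long-term flag. Neither is taken from the invention, and the class name `VoiceActivityFlags` is hypothetical.

```python
from collections import deque


class VoiceActivityFlags:
    """Track a per-frame flag and a flag that persists over recent frames."""

    def __init__(self, threshold_ratio=2.0, history=5):
        self.threshold_ratio = threshold_ratio  # assumed energy margin over noise
        self.recent = deque(maxlen=history)     # last `history` short-term flags

    def update(self, frame_energy, noise_energy):
        short_term = frame_energy > self.threshold_ratio * noise_energy
        self.recent.append(short_term)
        long_term = any(self.recent)  # active if any recent frame was active
        return short_term, long_term


vad = VoiceActivityFlags(threshold_ratio=2.0, history=2)
flags = [vad.update(energy, 1.0) for energy in (10.0, 1.0, 1.0)]
```

The long-term flag stays set for a short time after speech ends, which guards against treating brief low-energy gaps within speech as noise.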
In accordance with yet another aspect of the present invention, a noise flag is generated using an output from a voice activity detector and is provided as an input to the noise reduction algorithm.
In accordance with another aspect of the present invention, an integrated coder is provided with a noise reduction algorithm via either a post-compression or a pre-compression scheme.
REFERENCES:
patent: 4868867 (1989-09-01), Davidson et al.
patent: 4969192 (1990-11-01), Chen et al.
patent: 5133013 (1992-07-01), Munday
patent: 5388182 (1995-02-01), Benedetto et al.
patent: 5432859 (1995-07-01), Yang et al.
patent: 5550924 (1996-08-01), Helf et al.
patent: 5687285 (1997-11-01), Katayanagi et al.
patent: 5706394 (1998-01-01), Wynn
patent: 5734789 (1998-03-01), Swaminathan et al.
patent: 5737695 (1998-04-01), Lagerqvist et al.
patent: 5742927 (1998-04-01), Crozier et al.
patent: 5749067 (1998-05-01), Barrett
patent: 5774837 (1998-06-01), Yeldener et al.
patent: 5774839 (1998-06-01), Shlomot
patent: 5774846 (1998-06-01), Morii
patent: 5826224 (1998-10-01), Gerson et al.
patent: 5890108 (1999-03-01), Yeldener
patent: 5899968 (1999-05-01), Navarro et al.
patent: 5937377 (1999-08-01), Hardimann et al.
patent: 6230123 (2001-05-01), Mekuria et al.
Manfred R. Schroeder, “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates”, Proc. ICASSP '85, pp. 937-940, 1985.
Walter Etter, “Noise Reduction by Noise-Adaptive Spectral Magnitude Expansion”, J. Audio Eng. Soc., vol. 42, No. 5, May 1994.
Peter M. Clarkson and Sayed F. Bahgat, “Envelope expansion methods for speech enhancement”, J. Acoust. Soc. Am., vol. 89, No. 3, Mar. 1991.
Bertram Scharf, “Critical Bands”, Foundations of Modern Auditory Theory, J.V. Tobias ed., Academic Press, 1970.
Ertem Filiz Basbug
Nandkumar Srinivas
Swaminathan Kumar
Chawan Vijay B
Hughes Electronics Corporation
Sales Michael W.
Whelan John T.
Method of noise reduction for speech codecs