Reexamination Certificate
1999-08-04
2002-02-19
Dorvil, Richemond (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S226000
active
06349278
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to a method for estimating a speech signal in the presence of noise and, more particularly, to a soft decision signal estimation method for generating a soft estimate of a speech signal contained in a received signal.
BACKGROUND OF THE INVENTION
One function of a digital communication system is to transmit a speech signal from a source to a destination. The speech signal is often corrupted by noise, which complicates and degrades the performance of coding, detection, and recognition algorithms. This problem is particularly severe in mobile communication systems, where numerous common sources of noise exist. For example, common noise sources in a mobile communication system include engine noise, background music, environmental noise (such as noise from an open window), and background speech from other persons. The efficiency of coding and recognition algorithms depends on being able to efficiently and accurately estimate both the speech and noise components of a received signal. There are many approaches presented in the literature to solve this problem. Among those, spectral subtraction is one of the most popular techniques because the speech signal is quasi-stationary, and the algorithm can be implemented efficiently using the Fast Fourier Transform (FFT).
The spectral subtraction method for signal estimation is based on the assumption that speech is present. When transmitted over the communication channel, the speech signal is corrupted by noise. The signal observed at the receiving end is the mixture of the speech signal and noise signal. The received signal is filtered in the frequency domain by a filter, such as a matched filter, that attempts to minimize the noise component in the received signal. The output of the matched filter is the estimate of the speech signal based on the assumption that speech was transmitted.
A filter commonly used in a signal detector is a Wiener filter, which minimizes the mean square error between the transmitted speech signal and the signal estimate. The Wiener filter uses the power spectral density (PSD) of the speech signal and noise signal to produce an estimate of the speech signal. Because the speech and noise signals are combined in the received signal, it is generally not possible to calculate the power spectral density of the speech signal and noise signal simultaneously. However, in a voice communication system, such as a mobile communication system, the speech signal is not present at all times. Thus, the power spectral density of the noise signal can be estimated during the time that the speech is absent. Assuming that changes in the noise signal are slow, the power spectral density of the speech signal can be calculated during the time that speech is present by subtracting the power spectral density of the noise signal (calculated when speech was not present) from the power spectral density of the received signal. This technique for calculating the power spectral density of the speech signal assumes that the speech signal and noise signal are independent, which is not always correct.
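The PSD subtraction described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name `wiener_gain`, the `floor` parameter, and the use of plain Python lists of per-bin PSD values are all assumptions made for the example. It uses the textbook Wiener gain H = P_s / (P_s + P_n), with the speech PSD P_s obtained by subtracting the noise PSD from the received PSD and clamping at zero.

```python
def wiener_gain(received_psd, noise_psd, floor=0.0):
    """Per-bin Wiener gain from PSD estimates (illustrative sketch).

    received_psd: PSD of the received signal, measured while speech is present.
    noise_psd:    PSD of the noise, measured while speech is absent.
    floor:        hypothetical lower bound on the gain (assumption, default 0).
    """
    gains = []
    for p_x, p_n in zip(received_psd, noise_psd):
        if p_x <= 0.0:
            # No energy in this bin: nothing to pass through.
            gains.append(floor)
            continue
        # Spectral subtraction: speech PSD estimate, clamped at zero because
        # a PSD cannot be negative.
        p_s = max(p_x - p_n, 0.0)
        # When p_s > 0, p_s + p_n equals p_x, so P_s / (P_s + P_n) = p_s / p_x.
        # When the estimate is clamped to zero, the gain is zero either way.
        gains.append(max(p_s / p_x, floor))
    return gains
```

For instance, `wiener_gain([4.0, 1.0], [1.0, 2.0])` passes three quarters of the first bin and suppresses the second, where the noise estimate exceeds the received power.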
In order to estimate the power spectral density of the noise signal and speech signal, a voice activity detector (VAD) is used to detect the presence of speech in the received signal. In a conventional VAD, the received signal input to the VAD is filtered, squared, and summed in order to measure the power of the signal during a given time period. The VAD produces an estimate θ̂ indicating whether speech is present. In a conventional detector, a hard decision is made, meaning that θ̂ takes on a value of 1 when speech is present and a value of 0 when speech is not present. The output of the Wiener filter is multiplied by θ̂. Consequently, a final estimate of the speech signal ŝ(k) is output only when θ̂ equals one. This method of signal estimation is known as hard decision estimation.
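A rough sketch of hard decision estimation follows. The function names and the `threshold` parameter are hypothetical, and the VAD's pre-filtering step is omitted for brevity, so the power measure here is simply the sum of squared samples in a frame.

```python
def frame_power(frame):
    """Sum of squared samples over one frame (VAD pre-filter omitted)."""
    return sum(x * x for x in frame)


def hard_vad(frame, threshold):
    """Hard decision: 1 if the frame power exceeds the threshold, else 0.

    `threshold` is a hypothetical tuning parameter, not a value from the source.
    """
    return 1 if frame_power(frame) > threshold else 0


def hard_decision_estimate(filtered_frame, theta_hat):
    """Multiply the Wiener filter output by the hard decision (0 or 1)."""
    return [theta_hat * x for x in filtered_frame]
```

With this scheme a frame is either passed unchanged or zeroed entirely, which is why a single missed detection removes the speech from the output.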
In hard decision signal estimation, errors made by the voice activity detector can result in significant error in the final estimate of the speech signal. For example, assume that a signal containing speech is received but is not detected by the voice activity detector. In this case, the speech signal will not be output from the signal detector.
Soft decision signal estimation was explored in R. J. McAulay and M. L. Malpass, “Speech Enhancement Using a Soft-Decision Noise Suppression Filter,” IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-28:137-145, 1980. This article describes a signal estimation technique where the estimate θ̂ is not restricted to 1 or 0, but can be any number in the range 0 to 1. However, the soft decision signal estimation technique described in the article is based on the assumption that the speech signal is a deterministic signal with unknown magnitude and phase. In fact, speech is a random process, so the model used to estimate the speech signal is not appropriate. Therefore, the signal estimation technique described in the article is not optimal for detection of a speech signal.
SUMMARY OF THE INVENTION
The present invention is a soft decision signal estimation algorithm for generating an estimate of a speech signal from a received signal containing both speech and noise components. The received signal is converted to the frequency domain by a Fast Fourier Transform (FFT). In the frequency domain, the received signal is filtered by a Wiener filter to eliminate, as much as possible, the noise component of the signal. The output signal from the Wiener filter is converted back to the time domain by an inverse FFT. The output signal from the Wiener filter is then combined in the time domain with a speech probability estimate generated by a voice activity detector (VAD) to obtain a soft estimate of the speech signal.
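The processing chain summarized above can be sketched as follows. Several things here are assumptions for illustration only: a direct O(n²) DFT stands in for the FFT so the example needs nothing beyond the standard library, the function names are hypothetical, and "combined" is taken to mean sample-wise multiplication by the speech probability estimate, by analogy with the hard decision case.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def soft_estimate(frame, gains, speech_prob):
    """Soft estimate of one frame (illustrative sketch).

    frame:       real-valued time-domain samples of the received signal.
    gains:       per-bin Wiener filter gains, same length as the frame.
    speech_prob: speech probability estimate from the VAD, in [0, 1].
    """
    spectrum = dft(frame)                               # to frequency domain
    filtered = [g * X for g, X in zip(gains, spectrum)]  # Wiener filtering
    # Back to the time domain; the real part is kept, which is exact when the
    # gains are symmetric about the Nyquist bin.
    time_signal = [v.real for v in idft(filtered)]
    # Assumed combination rule: scale by the speech probability estimate.
    return [speech_prob * v for v in time_signal]
```

With unit gains and a speech probability of 0.5, the frame comes back at half amplitude, so uncertain frames are attenuated rather than discarded.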
A voice activity detector is used to compute the speech probability estimate. In conventional signal estimation, the VAD detects whether the received signal contains a speech component and outputs a hard decision (i.e. 0 or 1). In the present invention, the VAD generates a soft estimate of the probability of speech, called the speech probability estimate, that is combined with the output of the Wiener filter to obtain a soft estimate of the speech signal. To compute the speech probability estimate, the VAD computes a likelihood ratio based on the received signal. The likelihood ratio and the a priori probability of speech are used to compute the speech probability estimate. The likelihood ratio is also used to determine when to update the frequency response of the Wiener filter and VAD filter.
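One standard way to turn a likelihood ratio and an a priori probability into a posterior probability is Bayes' rule. The sketch below assumes that form; the source does not give the exact expression it uses, so both the formula and the name `speech_probability` should be read as illustrative.

```python
def speech_probability(likelihood_ratio, prior_speech):
    """Posterior probability of speech via Bayes' rule (assumed form).

    likelihood_ratio: p(x | speech) / p(x | no speech), must be >= 0.
    prior_speech:     a priori probability of speech, strictly between 0 and 1.
    """
    if not 0.0 < prior_speech < 1.0:
        raise ValueError("prior_speech must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    weighted = prior_speech * likelihood_ratio
    # P(speech | x) = p * L / (p * L + (1 - p))
    return weighted / (weighted + (1.0 - prior_speech))
```

A likelihood ratio of 1 leaves the prior unchanged, while larger ratios push the estimate toward 1 and smaller ones toward 0, giving the continuous value that a hard decision lacks.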
REFERENCES:
patent: 5012519 (1991-04-01), Adlersberg et al.
patent: 5251263 (1993-10-01), Andrea et al.
patent: 5511009 (1996-04-01), Pastor
patent: 5577161 (1996-11-01), Pelaez
patent: 5630015 (1997-05-01), Kane et al.
patent: 5768473 (1998-06-01), Eatwell et al.
patent: 5839101 (1998-11-01), Vahatalo et al.
patent: 5918204 (1999-06-01), Tsurumaru
patent: 5974373 (1999-10-01), Chan et al.
patent: 6023674 (2000-02-01), Mekuria
patent: 0784311 (1996-11-01), None
ICASSP-95. George, “Single sensor speech enhancement using a soft decision/variable attenuation algorithm” pp. 816-819 vol. 1 May 1995.*
1998 URSI International Symposium on Signals, Systems and Electronics. Ibrahim et al., “Iterative decoding and soft interference cancellation for the Gaussian multiple access channel” pp. 156-161. Oct. 1998.*
“Speech Enhancement Using a Soft-Decision Noise Suppression Filter” by Robert J. McAulay, Member, IEEE, and Marilyn L. Malpass—IEEE Transactions On Acoustics, Speech, and Signal Processing, vol. ASSP-28, No. 2, Apr. 1980.
Sohn, Jongseo and Sung, Wonyong; “A Voice Activity Detector Employing Soft Decision Based Noise Spectrum Adaptation”; IEEE, 6/98, pp. 365-368. Dec. 1998.
Krasny Leonid
Nguyen Truong
Oraintara Soontorn
Coats & Bennett P.L.L.C.
Dorvil Richemond
Ericsson Inc.