Method for suppressing noise in a digital speech signal

Filed: 2000-06-05
Issued: 2002-11-05
Examiner: Banks-Harold, Marsha D. (Department: 2654)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Psychoacoustic

Classification codes: C704S205000, C704S226000, C381S094300
Type: Reexamination Certificate
Status: active
Patent number: 06477489

BACKGROUND OF THE INVENTION
The present invention relates to digital techniques for suppressing noise in speech signals. It relates more particularly to noise suppression by non-linear spectral subtraction.
Because of the widespread adoption of new forms of communication, in particular mobile telephones, communications are increasingly made in very noisy environments. The noise, added to the speech, then tends to interfere with the communication by preventing optimum compression of the speech signal and creating unnatural background noise. The noise makes understanding the spoken message difficult and tiring.
Many algorithms have been investigated in attempts to reduce the effects of noise in a communication. S. F. Boll (“Suppression of acoustic noise in speech using spectral subtraction”, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. ASSP-27, No. 2, April 1979) has proposed an algorithm based on spectral subtraction. This technique consists of estimating the spectrum of the noise during phases of silence and subtracting it from the spectrum of the received signal. It reduces the received noise level, but its main defect is that it creates musical noise, which is particularly bothersome because it sounds unnatural.
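The basic spectral subtraction described above can be illustrated with a minimal sketch. It is not Boll's published algorithm: it assumes the per-bin magnitudes of each frame are already available, and the function names `estimate_noise_magnitude` and `spectral_subtract` are hypothetical.

```python
from statistics import fmean

def estimate_noise_magnitude(silence_frames):
    """Average the per-bin magnitude over frames assumed to contain only noise."""
    n_bins = len(silence_frames[0])
    return [fmean(frame[k] for frame in silence_frames) for k in range(n_bins)]

def spectral_subtract(noisy_magnitude, noise_magnitude):
    """Subtract the noise estimate per bin, clamping negative results to zero."""
    return [max(s - n, 0.0) for s, n in zip(noisy_magnitude, noise_magnitude)]

# Two "silence" frames with three frequency bins each (illustrative values).
silence = [[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]]
noise = estimate_noise_magnitude(silence)
cleaned = spectral_subtract([5.0, 1.5, 4.0], noise)
```

Because the noise in any given frame fluctuates around its average, some bins survive the subtraction as isolated peaks while their neighbours are clamped to zero, which is one common explanation for the musical noise mentioned above.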
This work was taken up and improved on by D. B. Paul (“The spectral envelope estimation vocoder”, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. ASSP-29, No. 4, August 1981) and by P. Lockwood and J. Boudy (“Experiments with a nonlinear spectral subtractor (NSS), Hidden Markov Models and the projection, for robust speech recognition in cars”, Speech Communication, Vol. 11, June 1992, pages 215-228, and EP-A-0 534 837), whose work significantly reduced the level of the noise whilst preserving its natural character. Moreover, this contribution had the merit of incorporating the principle of masking into the computation of the noise suppression filter for the first time. Based on this idea, a first attempt was made by S. Nandkumar and J. H. L. Hansen (“Speech enhancement on a new set of auditory constrained parameters”, Proc. ICASSP 94, pages I.1-I.4) to use explicitly computed masking curves in the spectral subtraction. Although the results of that technique were disappointing, it had the merit of emphasizing the importance of not degrading the speech signal during noise suppression.
Other methods based on breaking the speech signal down into singular values, and thus on projecting the speech signal into a smaller space, were investigated by Bart De Moor (“The singular value decomposition and long and short spaces of noisy matrices”, IEEE Trans. on Signal Processing, Vol. 41, No. 9, September 1993, pages 2826-2838) and by S. H. Jensen et al. (“Reduction of broad-band noise in speech by truncated QSVD”, IEEE Trans. on Speech and Audio Processing, Vol. 3, No. 6, November 1995). The principle of this technique is to treat the speech signal and the noise signal as totally decorrelated and to assume that the speech signal is predictable enough to be predicted from a restricted set of parameters. This technique produces acceptable noise suppression for highly voiced signals, but totally alters the nature of the speech signal. When the noise is relatively coherent, such as vehicle tire or engine noise, it can be predicted more easily than the unvoiced speech signal. There is then a tendency to project the speech signal into part of the vector space of the noise. The method thus fails to take the speech signal fully into account, in particular in unvoiced speech areas where the predictability is low. Moreover, predicting the speech signal on the basis of a small set of parameters prevents all of the intrinsic richness of speech from being taken into account. The limitations of techniques based only on mathematical considerations and overlooking the particular nature of speech are clear.
Finally, other techniques are based on criteria of coherence. The coherence function is set out in detail by J. A. Cadzow and O. M. Solomon (“Linear modeling and the coherence function”, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. ASSP-35, No. 1, January 1987, pages 19-28), and its application to noise suppression has been investigated by R. Le Bouquin (“Enhancement of noisy speech signals: application to mobile radio communications”, Speech Communication, Vol. 18, pages 3-19). This method is based on the fact that the speech signal is significantly more coherent than the noise if a plurality of independent channels is used. The results obtained appear to be fairly encouraging. However, this technique requires a plurality of sound pick-up points, which are not always available.
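To make the notion of coherence between channels concrete, the following sketch computes the standard magnitude-squared coherence per frequency bin from the complex spectra of two channels, averaged over several frames. It is a textbook definition and not necessarily the exact estimator used in the cited papers; the function name and the input layout (one list of complex bins per frame) are assumptions.

```python
def magnitude_squared_coherence(frames_x, frames_y):
    """Per-bin coherence |Pxy|^2 / (Pxx * Pyy), with spectra summed over frames.

    frames_x and frames_y are lists of frames; each frame is a list of
    complex spectral components, one per frequency bin.
    """
    n_bins = len(frames_x[0])
    coherence = []
    for k in range(n_bins):
        # Cross-spectrum and auto-spectra accumulated over all frames.
        pxy = sum(x[k] * y[k].conjugate() for x, y in zip(frames_x, frames_y))
        pxx = sum(abs(x[k]) ** 2 for x in frames_x)
        pyy = sum(abs(y[k]) ** 2 for y in frames_y)
        denom = pxx * pyy
        coherence.append(abs(pxy) ** 2 / denom if denom > 0.0 else 0.0)
    return coherence

# Bin 0 is identical on both channels; bin 1 flips sign between frames on y.
channel_x = [[1 + 0j, 1 + 0j], [0 + 1j, 1 + 0j]]
channel_y = [[1 + 0j, 1 + 0j], [0 + 1j, -1 + 0j]]
msc = magnitude_squared_coherence(channel_x, channel_y)
```

A bin whose phase relationship between the two channels stays fixed from frame to frame gives a value near 1, while a bin whose relationship varies gives a value near 0, which is the property a coherence-based noise suppressor would rely on.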
A main object of the present invention is to propose a new noise suppression technique which takes account of the characteristics of perception of speech by the human ear, so enabling efficient noise suppression without deteriorating the perception of the speech.
SUMMARY OF THE INVENTION
The invention therefore proposes a method of suppressing noise in a digital speech signal processed by successive frames, comprising the steps of:
computing spectral components of the speech signal of each frame;
computing, for each frame, overestimates of spectral components of the noise included in the speech signal;
performing a spectral subtraction including at least a first subtraction step in which a respective first quantity dependent on parameters including the overestimate of the corresponding spectral component of the noise for said frame is subtracted from each spectral component of the speech signal of the frame, to obtain spectral components of a first noise-suppressed signal; and
subjecting the result of the spectral subtraction to a transformation into the time domain to construct a noise-suppressed speech signal.
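The four steps listed above can be sketched for a single frame as follows. This is only an illustration under stated assumptions: the overestimate is taken to be a fixed factor `alpha` times a given noise magnitude estimate, the first quantity subtracted is the overestimate itself, the phase of the noisy signal is kept unchanged, and windowing and overlap between frames are omitted. A naive DFT is used so the example has no dependencies; all names are hypothetical.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def suppress_frame(frame, noise_magnitude, alpha=2.0):
    """Spectral subtraction on one frame using an overestimated noise spectrum."""
    spectrum = dft(frame)                      # step 1: spectral components
    cleaned = []
    for s, b in zip(spectrum, noise_magnitude):
        magnitude = abs(s)
        overestimate = alpha * b               # step 2: noise overestimate (assumed form)
        new_magnitude = max(magnitude - overestimate, 0.0)   # step 3: subtraction
        gain = new_magnitude / magnitude if magnitude > 0.0 else 0.0
        cleaned.append(gain * s)               # keep the noisy phase
    return idft(cleaned)                       # step 4: back to the time domain

# A pure tone occupying bins 2 and 6 of an 8-point frame (illustrative input).
frame = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
output = suppress_frame(frame, [1.0] * 8, alpha=2.0)
```

In this example the tone has magnitude 4 in its two bins, so subtracting an overestimate of 2 halves its amplitude, which hints at how an excessive overestimate can distort the speech itself.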
According to the invention, the spectral subtraction further includes the following steps:
computing a masking curve by applying an auditory perception model on the basis of spectral components of the first noise-suppressed signal;
comparing overestimates of the spectral components of the noise for the frame to the computed masking curve; and
a second subtraction step in which a respective second quantity depending on parameters including a difference between the overestimate of the corresponding spectral component of the noise and the computed masking curve is subtracted from each spectral component of the speech signal of the frame.
The second quantity subtracted can in particular be limited to the fraction of the overestimate of the corresponding spectral component of the noise which is above the masking curve. This approach is based on the observation that it is sufficient to suppress audible noise frequencies. In contrast, there is no point in eliminating noise that is masked by speech.
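The limited second subtraction can be sketched as below. The sketch assumes that the masking curve has already been computed by an auditory perception model, which is not specified here, and that all quantities are per-bin magnitudes; the function name `second_subtraction` and the clamp at zero are assumptions made for illustration.

```python
def second_subtraction(speech_magnitude, noise_overestimate, masking_curve):
    """Subtract only the part of the noise overestimate above the masking curve."""
    result = []
    for s, b, m in zip(speech_magnitude, noise_overestimate, masking_curve):
        audible_noise = max(b - m, 0.0)    # zero when the noise is fully masked
        result.append(max(s - audible_noise, 0.0))
    return result

# Three bins: noise masked, noise partly audible, noise exactly at the mask.
suppressed = second_subtraction(
    speech_magnitude=[10.0, 4.0, 6.0],
    noise_overestimate=[3.0, 3.0, 3.0],
    masking_curve=[5.0, 1.0, 3.0],
)
```

Only the middle bin is modified, because it is the only one in which the overestimated noise exceeds the masking curve; the bins where the speech masks the noise are left untouched.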
It is generally desirable to overestimate the spectral envelope of the noise so that the overestimate thereby obtained is robust to sudden variations of the noise. However, excessive overestimation usually has the drawback of distorting the speech signal. This affects the voiced character of the speech signal, eliminating some of its predictability. This drawback is very bothersome in telephony, since it is in the voiced areas that the speech signal has the most energy. The invention greatly attenuates this drawback by limiting the subtracted quantity if all or part of a frequency component of the overestimated noise proves to be masked by the speech.

REFERENCES:
patent: 5151941 (1992-09-01), Nishiguchi et al.
patent: 5228088 (1993-07-01), Kane et al.
patent: 5400409 (1995-03-01), Linhard
patent: 5450522 (1995-09-01), Hermansky et al.
patent: 5469087 (1995-11-01), Eatwell
patent: 5555190 (1996-09-01), Derby et al.
patent: 5717768 (1998-02-01), Laroche
patent: 5742927 (1998-04-01), Crozier et al.
patent: 5839101 (1998-11-01), Vahatalo et al.
patent: 6144937 (2000-11-01), Ali
patent: 0 438 174 (1991-07-01), None
patent: 0 661 821 (1995-07-01), None
patent: 95/02930 (1995-01-01), None
R Le Bouquin et al., <<Enhancement of Noisy Speech Signals: A

Affiliated with

Lockwood Philip

Inventor

Lubiarz Stéphane

Inventor

Also associated with

Banks-Harold Marsha D.

Examiner

Lerner Martin

Examiner

Matra Nortel Communications

Corporate Assignee
