Method and apparatus for performing double-talk detection...


Reexamination Certificate

: 2000-03-30
: 2004-08-10
: To, Doris H. (Department: 2655)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Recognition

: C704S246000, C379S406140, C379S406010
: Reexamination Certificate
: active
: 06775653
: ABSTRACT:

FIELD OF THE INVENTION
The present invention is directed to a method and apparatus for performing double-talk detection, and more particularly, to a method and apparatus for performing double-talk detection with adaptive decision thresholding.
BACKGROUND ART
Communications usually include at least two parties and associated hardware. With respect to one set of hardware, the speech from the party co-located with the hardware is termed near-end speech and the speech from the other party is termed far-end speech. Most conventional echo cancellers (which may be used with both sets of hardware) use an adaptive filter to estimate the echo path and synthesize an estimated echo signal, which is subtracted from the signal Sin in order to reduce the near-end echo.
FIG. 1 illustrates a conventional echo canceller 10, including an adaptive FIR filter 12, which performs a normalized least mean square (NLMS) algorithm, a double-talk detector 14, which performs speech detection and comparison, and a hybrid 16. In order to correctly estimate the actual echo path from the input (Rout of the echo canceller 10, usually the same as the echo canceller 10 Rin signal) and output (Sin of the echo canceller 10) signals, the output of the echo path must originate solely from the input signal. The adaptive FIR filter 12 is easily modified to estimate the echo path if the near-end and the far-end parties speak one at a time. When both parties speak simultaneously, this situation is termed “double-talk”. During double-talk, the output signal contains not only the echo of the input signal, but the near-end speech signal as well.
When near-end speech is present, the adaptation of the filter 12 should be inhibited; otherwise an erroneous estimate of the echo path is obtained, which results in poor echo cancellation. The role of the double-talk detector 14 is to sense when the echo is corrupted by near-end speech and then inhibit the adaptation of the filter 12. Because the filter tends to diverge during double-talk situations, the double-talk detector 14 has a large impact on the overall performance of the echo canceller 10.
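The interaction between NLMS adaptation and the inhibit decision can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the two-tap echo path, the step size, and the hand-set inhibit flags (standing in for a real double-talk detector) are all assumptions chosen for illustration.

```python
import random


def nlms_step(weights, far_history, sin_sample, step_size=0.5,
              adapt=True, eps=1e-8):
    """One NLMS iteration; returns (residual error, updated weights)."""
    # Synthesize the estimated echo with the current FIR weights.
    echo_estimate = sum(w * x for w, x in zip(weights, far_history))
    error = sin_sample - echo_estimate
    if not adapt:
        # Adaptation inhibited: keep cancelling, but freeze the weights.
        return error, list(weights)
    # Normalize the update by the energy of the far-end history.
    energy = sum(x * x for x in far_history) + eps
    gain = step_size * error / energy
    return error, [w + gain * x for w, x in zip(weights, far_history)]


def run_canceller(far_end, sin_signal, num_taps, inhibit_flags):
    """Run the canceller over whole signals; returns (errors, final weights)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps
    errors = []
    for x, s, inhibit in zip(far_end, sin_signal, inhibit_flags):
        history = [x] + history[:-1]  # newest far-end sample first
        e, weights = nlms_step(weights, history, s, adapt=not inhibit)
        errors.append(e)
    return errors, weights


def misalignment(weights, true_path):
    """Squared distance between the estimated and the true echo path."""
    return sum((w - h) ** 2 for w, h in zip(weights, true_path))


# Hypothetical test signals: 2000 samples of far-end speech only,
# followed by 200 samples of double-talk.
rng = random.Random(0)
echo_path = [0.5, -0.2]  # assumed two-tap echo path
far = [rng.uniform(-1, 1) for _ in range(2200)]
echo = [echo_path[0] * far[n] + (echo_path[1] * far[n - 1] if n > 0 else 0.0)
        for n in range(len(far))]
near_speech = [0.0] * 2000 + [rng.uniform(-1, 1) for _ in range(200)]
sin_signal = [e + v for e, v in zip(echo, near_speech)]

# Oracle flags: inhibit exactly while near-end speech is present.
inhibit_in_double_talk = [n >= 2000 for n in range(len(far))]
never_inhibit = [False] * len(far)

_, w_inhibited = run_canceller(far, sin_signal, 2, inhibit_in_double_talk)
_, w_adapting = run_canceller(far, sin_signal, 2, never_inhibit)
```

Comparing `misalignment(w_inhibited, echo_path)` with `misalignment(w_adapting, echo_path)` shows the effect described above: the frozen filter keeps its converged estimate, while the filter that keeps adapting through double-talk is pulled away from the true echo path.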
Numerous attempts have been made to perform double-talk detection which exploit the spectrum characteristic or the power level information derived from the near-end and far-end signals. For example, the conventional Geigel algorithm as described in D. L. Duttweiler, “A Twelve-Channel Digital Echo Canceller,” IEEE Trans. Commun., Vol. COM-26, pp. 647-653, 1978, which follows the power comparison concept, makes the basic assumption that echo has a much lower power level than the far-end speech signal. Therefore, if the near-end signal power is lower than the far-end speech by a certain threshold (usually 6 dB), the near-end signal is considered echo and the echo canceller tries to cancel it. Otherwise, double-talk is declared and adaptation is prohibited. The Geigel algorithm is very efficient (simple and low computation cost) and fairly effective (adequate for most applications).
However, the basic assumption of the Geigel algorithm is not true in the following cases:
(1) the near-end speaker is speaking with lower volume or excessive loss is introduced in the near-end analog circuits; and
(2) a large volume echo may occur in a mobile or hands-free phone or in some hybrids with severe leakage.
In these cases, the echo canceller may mistake the low-level near-end speech for echo and try to cancel it, or mistake the strong echo for near-end speech and try to keep it.
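The level comparison and its two failure cases can be sketched in Python as follows. This is an illustrative simplification rather than the algorithm as published: it compares sample magnitudes against the peak of a short far-end history (so the 6 dB threshold becomes an amplitude ratio of about 0.5), and the function name, history length, and sample values are assumptions.

```python
def geigel_double_talk(sin_sample, far_history, threshold_db=6.0):
    """Return True if double-talk is declared for this sample.

    Sin is treated as echo when its magnitude is at least `threshold_db`
    below the largest recent far-end magnitude; otherwise double-talk
    is declared and adaptation should be inhibited.
    """
    # Convert the dB threshold to an amplitude ratio (6 dB is about 0.5).
    ratio = 10.0 ** (-threshold_db / 20.0)
    far_peak = max((abs(x) for x in far_history), default=0.0)
    return abs(sin_sample) > ratio * far_peak


far_history = [1.0, -0.6, 0.3]  # hypothetical recent far-end samples

# Normal operation: a weak echo is classified as echo,
# and loud near-end speech is classified as double-talk.
weak_echo_flag = geigel_double_talk(0.1, far_history)
loud_speech_flag = geigel_double_talk(0.8, far_history)

# Failure case (1): quiet near-end speech falls below the threshold
# and is mistaken for echo.
quiet_speech_flag = geigel_double_talk(0.2, far_history)

# Failure case (2): an echo path with little attenuation produces an
# echo above the threshold, which is mistaken for near-end speech.
strong_echo_flag = geigel_double_talk(0.7, far_history)
```

With these sample values, `quiet_speech_flag` is `False` (near-end speech would be cancelled) and `strong_echo_flag` is `True` (adaptation would be inhibited although only echo is present), which is why a fixed threshold is fragile.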
Another class of double-talk algorithms is the cross-correlation or coherence-based algorithms (denoted here as “CORR-algorithms”), as described in, for example, J. Benesty et al., “A New Class of Double-Talk Detectors Based on Cross-Correlation,” IEEE Trans. Signal Processing, Vol. 46, No. 6, June 1998 and T. Gansler et al., “A Double-Talk Detector Based on Coherence,” IEEE Trans. Commun., Vol. 44, pp. 1421-1427, November 1996. These algorithms assume that the speech signals from the different parties are independent throughout the call, and use a cross-correlation coefficient vector between the Rout and Sin signals for double-talk detection. Since echoes can usually be approximated as attenuated and delayed versions of their original signals, a strong correlation should exist between the echoes and their originals. This makes the cross-correlation coefficient vector an efficient measurement for double-talk detection. Compared to the Geigel algorithm, the CORR-algorithms introduce an extra decision delay of at least one speech frame (usually several hundred samples) in order to reliably estimate the cross-correlation functions. Because the decision lags, adaptation also must be delayed in order to avoid severely canceling the initial part of the break-in near-end speech. The CORR-algorithms also are much more computationally complex, especially when estimating a coherence function in the spectral domain.
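The correlation idea can be illustrated with a short Python sketch. It is not the detector of either cited paper: it simply computes a normalized cross-correlation coefficient between one Rout frame and one Sin frame at each candidate delay and compares the largest magnitude with a fixed threshold. The function names, frame length, lag range, threshold value, and test signals are all assumptions.

```python
import math
import random


def normalized_xcorr(far_frame, sin_frame, lag):
    """Correlation coefficient between far_frame delayed by `lag` and sin_frame."""
    sxy = sxx = syy = 0.0
    for n in range(lag, len(sin_frame)):
        x, y = far_frame[n - lag], sin_frame[n]
        sxy += x * y
        sxx += x * x
        syy += y * y
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # no energy in one of the frames
    return sxy / math.sqrt(sxx * syy)


def max_xcorr(far_frame, sin_frame, max_lag):
    """Largest correlation magnitude over delays 0..max_lag."""
    return max(abs(normalized_xcorr(far_frame, sin_frame, lag))
               for lag in range(max_lag + 1))


def corr_double_talk(far_frame, sin_frame, max_lag, threshold=0.9):
    """Declare double-talk when Sin is only weakly correlated with Rout."""
    return max_xcorr(far_frame, sin_frame, max_lag) < threshold


# Hypothetical frame of 400 samples; the echo is the far-end signal
# attenuated by 0.4 and delayed by 3 samples.
far_rng = random.Random(1)
near_rng = random.Random(2)
far_frame = [far_rng.uniform(-1, 1) for _ in range(400)]
echo_frame = [0.0] * 3 + [0.4 * x for x in far_frame[:-3]]
near_speech = [near_rng.uniform(-1, 1) for _ in range(400)]
mixed_frame = [e + v for e, v in zip(echo_frame, near_speech)]

echo_only_flag = corr_double_talk(far_frame, echo_frame, max_lag=8)
double_talk_flag = corr_double_talk(far_frame, mixed_frame, max_lag=8)
```

Note that a decision is only available once a whole frame has been collected, and that each decision costs on the order of frame length times number of lags multiplications, which mirrors the delay and complexity drawbacks noted above.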
Other attempts to resolve the double-talk problem can be found in K. Ochiai et al., “Echo Canceller with Two Echo Path Models,” IEEE Trans. Commun., Vol. COM-25, pp. 589-595, June 1977, which uses an echo canceller with two echo path models, or in C. Carlemalm et al., “On Detection of Double-Talk and Changes in the Echo Path Using a Markov Modulated Channel Model,” Proc. Intl. Conf. ASSP, Munich, Germany, Apr. 20-24, 1997, Vol. V, pp. 3869-3872, which uses a Markov modulated channel model.
Each of the above-described detection techniques has at least one common feature, namely that a suitable precision threshold is critical, due to the time-varying properties of the speech levels, the background noise, and the attenuation of the echo path.
This suggests that a fixed decision threshold is not appropriate and should be replaced by an adaptive decision threshold which is capable of continuously tracking variations during the calls. Furthermore, the parameter estimation and double-talk detection algorithms must be fast in order to prevent the synthesizing filter in the echo canceller from diverging.
SUMMARY OF THE INVENTION
The present invention solves the problems with conventional double-talk detectors and echo cancellers by providing a double-talk detector and a method of performing double-talk detection, as well as an echo canceller and a method of performing echo cancellation, each of which utilizes an adaptive threshold. The adaptive threshold is capable of continuously tracking variations during a telephone call, and permits the double-talk detector, echo canceller, and methods of the present application to adjust to the time-varying properties of speech levels, background noise and/or the attenuation of the echo path.
In another preferred embodiment, the present invention permits the use of two or more, complementary double-talk detection algorithms. For example, one of the double-talk detection algorithms could be a detection algorithm, such as the Geigel algorithm, which is simple and has low computational cost, and is fairly effective, and the other could be a cross-correlation or coherence-based algorithm, which may be more accurate, but also more computationally complex.
In another embodiment of the present invention, the double-talk detector, echo canceller, and methods of the present application, include processing elements which are frame-based, sample-based, or a combination of both.

REFERENCES:
patent: 4897832 (1990-01-01), Suzuki et al.
patent: 5598468 (1997-01-01), Ammicht et al.
patent: 5602913 (1997-02-01), Lee et al.
patent: 5613037 (1997-03-01), Sukkar
patent: 5619566 (1997-04-01), Fogel
patent: 5663955 (1997-09-01), Iyengar
patent: 5664011 (1997-09-01), Crochiere et al.
patent: 5764753 (1998-06-01), McCaslin et al.
patent: 6192126 (2001-02-01), Koski
patent: 6385176 (2002-05-01), Iyengar et al.
patent: 6611594 (2003-08-01), Benesty et al.
Cho, “An Objective Technique for evaluating Doubletalk Detectors. . . . .Echo Cancelers”, IEEE Transaction for Speech and Audio Processing, vol. 7 #6, Nov. 1999.*
D.L. Duttweiler, “A Twelve-Channel Digital Echo Canceller,” IEEE Trans. Commun., vol. COM-26,

Inventor: Wei Xiong Guan
Examiner: Opsasnick Michael N.
Examiner: To Doris H.
