Method and apparatus for performing double-talk detection in...

Telephonic communications – Echo cancellation or suppression

Reexamination Certificate

: 2000-07-21
: 2004-07-20
: Tieu, Binh (Department: 2643)
: Telephonic communications
: Echo cancellation or suppression

: C379S406020, C379S406050, C379S406080, C379S406130, C379S417000
: Reexamination Certificate
: active
: 06766019
: ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to the field of acoustic echo cancellation and more particularly to an improved method for detecting double-talk in acoustic echo cancellation systems.
BACKGROUND OF THE INVENTION
With the increasingly commonplace use of speakerphones and teleconferencing, acoustic echo cancellation has recently become a topic of critical importance. In particular, an acoustic echo canceller (AEC) ideally removes the undesired echo signal that invariably feeds back from the loudspeaker to the microphone in full-duplex hands-free telecommunications systems. Echo cancellation is performed by modeling the echo path impulse response with an adaptive finite impulse response (FIR) filter, fully familiar to those of ordinary skill in the art, and subtracting the computed echo estimate from the microphone output signal (i.e., the return signal).
FIG. 1 shows a diagram of an illustrative single-channel AEC. (In many cases, stereo echo cancellers are used, but in the context of the instant problem and the present invention, the use of a single-channel teleconferencing system will be adequate for purposes of understanding the invention.) The contents and operation of FIG. 1 will be described in detail below.
More specifically, an acoustic echo canceller mitigates the echo effect by adjusting the transfer function (i.e., the impulse response characteristic) of the adaptive filter in order to generate an estimate of the unwanted return signal. That is, the filter is adapted to mimic the effective transfer function of the acoustic path from the loudspeaker to the microphone. As such, by filtering the incoming signal (i.e., the signal coming from the far end, shown as x(n) in FIG. 1), the output of the filter will estimate the unwanted return signal which comprises the echo (shown as y(n) in FIG. 1). Then, this estimate is subtracted from the outgoing signal (i.e., the return signal) to produce an error signal (shown as e(n) in FIG. 1). By adapting the filter impulse response characteristic such that the error signal approaches zero, the echo is advantageously reduced or eliminated. That is, the filter coefficients, and hence the estimate of the unwanted echo, are updated in response to continuously received samples of the error signal so as to cancel the echo as completely as possible.
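The adaptation loop described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a normalized least-mean-squares (NLMS) coefficient update, which the text above does not specify. The function names, the step size `mu`, the regularization constant `eps`, and the simulated four-tap echo path are all hypothetical choices made for the example.

```python
import random

def nlms_echo_canceller(x, y, L, mu=0.5, eps=1e-8):
    """Adapt an L-tap FIR filter so its output tracks the echo in y.

    x: far-end samples, y: return (microphone) samples.
    Returns (e, h_hat): the error signal and the final coefficients.
    The NLMS update rule is an assumption made for this sketch.
    """
    h_hat = [0.0] * L          # estimated echo path coefficients
    x_buf = [0.0] * L          # x(n), x(n-1), ..., x(n-L+1)
    e = []
    for x_n, y_n in zip(x, y):
        x_buf = [x_n] + x_buf[:-1]
        # Echo estimate: filter the far-end signal with the current model.
        y_hat = sum(h * xv for h, xv in zip(h_hat, x_buf))
        # Error signal: return signal minus echo estimate.
        e_n = y_n - y_hat
        # Normalized update that drives the error toward zero.
        norm = sum(xv * xv for xv in x_buf) + eps
        h_hat = [h + mu * e_n * xv / norm for h, xv in zip(h_hat, x_buf)]
        e.append(e_n)
    return e, h_hat

def simulate(n_samples=2000, seed=0):
    """Hypothetical test case: white far-end signal, echo only, no noise."""
    rng = random.Random(seed)
    h = [0.5, -0.3, 0.2, 0.1]  # made-up echo path, for illustration only
    x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    y, buf = [], [0.0] * len(h)
    for x_n in x:
        buf = [x_n] + buf[:-1]
        y.append(sum(hv * xv for hv, xv in zip(h, buf)))
    e, h_hat = nlms_echo_canceller(x, y, L=len(h))
    return h, h_hat, e
```

In this idealized setting (no near-end speech, no background noise) the estimated coefficients approach the simulated echo path and the error signal decays toward zero.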
Additionally, double-talk detectors (DTD) are generally used in echo cancellers in order to disable the filter adaptation during double-talk conditions. That is, when both the near-end party and the far-end party to a conversation taking place across a telecommunications line speak simultaneously, it would be clearly undesirable to attempt to minimize the entire "error signal," since that signal now also includes the "double-talk" (i.e., the speech of the near-end speaker, shown as v(n) in FIG. 1). More specifically, the function of a double-talk detector is to recognize that double-talk is occurring, and to stop the filter from further adaptation until the double-talk situation ceases.
The basic double-talk detection scheme typically comprises the computation of a "detection statistic" and the comparison of that statistic with a predetermined threshold. Various prior art methods have been employed to form the detection statistic, each typically using the far-end speech signal, x(n), and the return signal, y(n), as the basis for computing the statistic. (Some approaches use the error signal, e(n), rather than the return signal, y(n), which provides essentially the same information.) Obviously, if there were no echo (i.e., the signal from the loudspeaker to the microphone remained totally undisturbed, or equivalently, the effective transfer function, h(n), of the receiving room were unity), and if furthermore there were no background noise, w(n), in the receiving room, then signals x(n) and y(n) would be identical if and only if there were no double-talk (i.e., x(n)=y(n) if and only if v(n)=0). Since this is not the case, however, the computation of a useful detection statistic must take the presence of the echo, as well as the possible presence of background noise, into account.
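The relationships among the signals named above can be summarized compactly. The following is a sketch under the assumption that the echo path is modeled as a linear FIR response of length L; the vector notation (bold symbols, the length L, and the stacked far-end vector) is introduced here for illustration and is not taken from the text above.

$$
\begin{aligned}
\mathbf{x}(n) &= [\,x(n) \;\; x(n-1) \;\; \cdots \;\; x(n-L+1)\,]^T \\
y'(n) &= \mathbf{h}^T \mathbf{x}(n) && \text{(echo)} \\
y(n) &= y'(n) + v(n) + w(n) && \text{(return signal)} \\
e(n) &= y(n) - \hat{\mathbf{h}}^T \mathbf{x}(n) && \text{(error signal)}
\end{aligned}
$$

Under this model, when the estimate matches the echo path, the error signal reduces to e(n) = v(n) + w(n), which is why adaptation driven by e(n) should be suspended while near-end speech v(n) is present.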
More specifically, the generalized procedure for handling double-talk may be described by the following four steps:
1. A detection statistic, ξ, is formed using the available signals (e.g., x(n), y(n), e(n), etc., and the estimated filter coefficients ĥ);
2. The detection statistic, ξ, is compared to a predetermined threshold, T, and double-talk is declared if, for example, ξ<T;
3. Once double-talk is detected, it is declared to exist for a minimum period of time, T_hold, during which the filter adaptation is disabled; and
4. If, for example, ξ≥T continuously for the interval T_hold, the filter then resumes adaptation, the comparison of ξ to T continues, and double-talk is declared to exist again when, for example, ξ<T.
Note that the use of a hold time T_hold in steps 3 and 4 above is advantageously employed in order to suppress detection dropouts due to the potentially noisy behavior of the detection statistic. Although there are some possible variations, most DTD algorithms have this basic form and differ only in their specific formation of the detection statistic (and the corresponding choice of the threshold, T).
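The threshold-and-hold logic of the four steps can be illustrated with a short Python sketch. It is not tied to any particular detection statistic: it takes a precomputed sequence of statistic values and applies the "declare if below threshold, then hold" rule. The function name, the use of a sample count for the hold time, and the convention that adaptation resumes on the sample that completes the clear interval are assumptions made for this example.

```python
def double_talk_flags(xi_values, T, hold):
    """Return one flag per sample; True means adaptation is disabled.

    xi_values: sequence of detection statistic values (step 1 output).
    T: threshold; double-talk is declared when xi < T (step 2).
    hold: hold time in samples (hypothetical unit for T_hold).
    """
    flags = []
    # Count of consecutive samples with xi >= T since the last detection.
    # Starting at `hold` means "no double-talk declared yet".
    clear_run = hold
    for xi in xi_values:
        if xi < T:
            clear_run = 0              # step 2/3: (re)declare double-talk
        elif clear_run < hold:
            clear_run += 1             # step 4: count the clear interval
        # Adaptation stays disabled until `hold` clear samples in a row.
        flags.append(clear_run < hold)
    return flags
```

For example, `double_talk_flags([3, 3, 1, 3, 3, 3, 3], T=2, hold=3)` returns `[False, False, True, True, True, False, False]`: the single sub-threshold sample disables adaptation, and adaptation resumes only once three consecutive samples are at or above the threshold.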
One particular prior art approach to the formation of the detection statistic, fully familiar to those skilled in the art, is due to A. A. Geigel. (See, e.g., D. L. Duttweiler, "A Twelve-Channel Digital Echo Canceller," IEEE Trans. Commun., vol. 26, no. 5, pp. 647-653, May 1978.) Although the Geigel technique has proven successful when used in network echo cancellers, it has often provided less than reliable performance when used in an acoustic echo cancellation application. Specifically, the Geigel DTD declares the presence of near-end speech whenever
$$\xi^{(g)} = \frac{\max\{|x(n)|, \ldots, |x(n-L_g+1)|\}}{|y(n)|} < T, \tag{1}$$
where L_g and T (the threshold) are suitably chosen constants. This detection scheme is based on a waveform level comparison between the return signal y(n) and the far-end speech x(n), assuming that the near-end speech v(n) at the microphone will typically be at the same level as, or stronger than, the echo y′(n). The maximum of the L_g most recent samples of x(n) is taken for the comparison because of the unknown delay in the echo path. The predetermined threshold T compensates for the gain of the echo path response h, and is often set to 2 for network echo cancellers because the hybrid (the echo path) loss is typically about 6 dB or more. For an AEC, however, it is not easy to set a universal threshold to work reliably in all the various situations because the loss through the acoustic echo path can vary greatly depending on many factors. For L_g, one easy choice is to set it the same as the adaptive filter length L since we can assume that the echo path is covered by this length.
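The Geigel statistic can be sketched directly from its definition: the largest magnitude among the L_g most recent far-end samples, divided by the magnitude of the current return sample. The Python below is an illustration only; the function names, the default threshold of 2 (the network echo canceller value mentioned above), and the small floor `tiny` that guards against division by zero when y(n) = 0 are assumptions of this sketch.

```python
def geigel_statistic(x_recent, y_n, tiny=1e-12):
    """max{|x(n)|, ..., |x(n-L_g+1)|} / |y(n)|.

    x_recent: the (up to) L_g most recent far-end samples.
    tiny: hypothetical floor on |y(n)| to avoid dividing by zero.
    """
    return max(abs(v) for v in x_recent) / max(abs(y_n), tiny)

def geigel_dtd(x, y, L_g, T=2.0):
    """Per-sample flags; True where near-end speech is declared (xi < T)."""
    flags = []
    for n in range(len(y)):
        # Window of the L_g most recent far-end samples (shorter at start-up).
        window = x[max(0, n - L_g + 1): n + 1]
        flags.append(geigel_statistic(window, y[n]) < T)
    return flags
```

With a far-end signal `x = [1.0, 0.2, 0.1, 0.1]` and a return signal `y = [0.0, 0.25, 0.05, 0.825]` (a made-up echo at one quarter amplitude plus a near-end burst in the last sample), `geigel_dtd(x, y, L_g=3)` flags only the last sample. A single raw decision like this would normally be combined with a hold time before being used to gate adaptation.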
Another prior art technique is to form the detection statistic based on the cross-correlation coefficient vector between the signals x(n) and e(n). (See, e.g., H. Ye et al., "A New Double-Talk Detection Algorithm Based on the Orthogonality Theorem," IEEE Trans. Commun., vol. 39, pp. 1542-1545, November 1991.) In fact, using the cross-correlation coefficient vector between x(n) and y(n), rather than between x(n) and e(n), has actually proven more robust and reliable. Specifically, the cross-correlation coefficient vector between x(n) and y(n) is defined as:
$$\mathbf{c}_{xy}^{(1)} = \frac{E\{\mathbf{x}(n)\,y(n)\}}{\sqrt{E\{x^2(n)\}\,E\{y^2(n)\}}} = \frac{\mathbf{r}_{xy}}{\sigma_x \sigma_y} = \left[\, c_{xy,0}^{(1)} \;\; c_{xy,1}^{(1)} \;\; \cdots \;\; c_{xy,L-1}^{(1)} \,\right]^T \tag{2}$$
where E{·} denotes mathematical expectation and c_xy,i^(1) is the cross-correlation coefficient

Affiliated with

Benesty Jacob

Inventor

Gaensler Tomas Fritz

Inventor

Also associated with

Agere Systems Inc.

Corporate Assignee

Botos Richard J.

Attorney

Brown Kenneth M.

Attorney

Pham Tuan

Examiner

Tieu Binh

Examiner
