Electrical audio signal processing systems and devices – Binaural and stereophonic
Reexamination Certificate
1999-07-29
2002-07-23
Isen, Forester W. (Department: 2644)
C381S027000
active
06424719
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to audio systems, in particular, “3D” audio systems.
BACKGROUND INFORMATION
Conventional 3D audio systems include: (i) a binaural spatializer, which simulates the appropriate auditory experience of one or more sources located around the listener; and (ii) a delivery system, which ensures that the binaural signals are received correctly at the listener's ears. Much work has been done on binaural spatialization and several commercial systems are currently available.
To achieve good reproduction of 3D audio, it is necessary to precisely control the acoustic signals at the listener's ears. One way to do this is to deliver the audio signals through headphones. In many situations, however, it is preferable not to wear headphones. The use of standard stereo loudspeakers is problematic, since there is a significant amount of left and right channel leakage known as “crosstalk”.
Acoustic crosstalk cancellation is a signal processing technique whereby two (or possibly more) loudspeakers are used to deliver 3D audio to a listener, without requiring headphones. The idea is to cancel the crosstalk signal that arrives at each ear from the opposite-side loudspeaker. If this can be successfully achieved, then the acoustic signals at the listener's ears can be controlled, just as if the listener was wearing headphones. A significant problem with existing crosstalk cancellation systems is that they are very sensitive to the position of the listener's head. Although good cancellation can be achieved for the head in a default position, the crosstalk signal is no longer canceled if the listener moves his head; in some cases head movement of only a couple of centimeters can have drastic effects.
With conventional systems, exact cancellation requires perfect knowledge of the acoustic transfer functions (TFs) between the loudspeakers and the listener's ears. These TFs are modeled using an assumed head position and generic head-related transfer functions (HRTFs). (See, for example, D. G. Begault, “3D sound for virtual reality and multimedia,” Academic Press Inc., Boston, 1994.) In practice, however, the real TFs will always differ from the assumed model, most noticeably by the listener's head moving from its assumed position. Any variation between the assumed model and the real environment will result in degradation in the performance of the crosstalk canceler: in some cases this performance degradation can be quite severe.
The only way to know the acoustic TFs exactly is to place microphones in the listener's ears and constantly update the crosstalk cancellation network appropriately. (See, e.g., P. A. Nelson et al., “Adaptive inverse filters for stereophonic sound reproduction”, IEEE Trans. Signal Processing, vol. 40, no. 7, pp. 1621-1632, July 1992.) However, it may be preferable to use some form of passive head tracking and adaptively update the cancellation network based on the current position of the listener's head. Methods of passive head tracking include: (i) using a head-mounted head tracker; (ii) using a microphone array to determine the head position from a command spoken by the listener (this may require the user to constantly speak to the system); or (iii) using a video camera. Although use of a video camera appears to be the most promising, even with an accurate camera-based head tracker, it is inevitable that there will still be some position errors in addition to errors between the generic HRTFs and the listener's own HRTFs. For these reasons, such a crosstalk canceler will be non-robust in practice.
FIG. 1 is a generalized block diagram of a conventional crosstalk cancellation system as described in U.S. Pat. No. 3,236,949 to Atal and Schroeder. $p_L$ and $p_R$ are the left and right program signals respectively, $l_1$ and $l_2$ are the loudspeaker signals, and $a_n^R$, $n = 1, 2$, is the transfer function (TF) from the nth loudspeaker to the right ear (a similar pair of TFs for the left ear, denoted by $a_n^L$, is not shown). The objective is to find the filter transfer functions $h_1$, $h_2$, $h_3$, $h_4$ such that: (i) the signals $p_L$ and $p_R$ are reproduced at the left and right ears respectively; and (ii) the crosstalk signals are canceled, i.e., none of the $p_L$ signal is received at the right ear, and similarly, none of the $p_R$ signal is received at the left ear.
Denoting the signals at the left and right ears as $e_L$ and $e_R$ respectively, the block diagram of FIG. 1 may be described by the following linear system:
\[
\begin{bmatrix} e_R \\ e_L \end{bmatrix}
=
\begin{bmatrix} a_1^R & a_2^R \\ a_1^L & a_2^L \end{bmatrix}
\begin{bmatrix} h_1 & h_3 \\ h_2 & h_4 \end{bmatrix}
\begin{bmatrix} p_R \\ p_L \end{bmatrix},
\qquad
e = A H p. \tag{1}
\]
To reproduce the program signals identically at the ears requires that
\[
H = A^{-1}. \tag{2}
\]
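For the two-loudspeaker case, the inverse in Eq. (2) can be written out with the standard 2×2 matrix inverse, taking the ordering of the entries of A and H from Eq. (1). This is only a worked restatement, and it assumes the determinant $a_1^R a_2^L - a_2^R a_1^L$ is nonzero at the frequency considered:
\[
\begin{bmatrix} h_1 & h_3 \\ h_2 & h_4 \end{bmatrix}
=
\frac{1}{a_1^R a_2^L - a_2^R a_1^L}
\begin{bmatrix} a_2^L & -a_2^R \\ -a_1^L & a_1^R \end{bmatrix}.
\]
When the determinant is small in magnitude, the filter gains become large, which is one way to see why the canceler can be sensitive to small changes in the TFs.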
For simplicity, only the response to the right program channel will be described. The description for the left channel would be similar. In this case, the block diagram in FIG. 1 reduces to a two-channel beamformer, with filters $h_1$ and $h_2$ on the respective channels.
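Let the response at the ears be:
\[
\begin{bmatrix} b_R \\ b_L \end{bmatrix}
=
\begin{bmatrix} a_1^R & a_2^R \\ a_1^L & a_2^L \end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix},
\qquad
b = A h, \tag{3}
\]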
where $b_R = 1$ (i.e., the right program signal is faithfully reproduced at the right ear), and $b_L = 0$ (i.e., none of the right program signal reaches the left ear). Assuming the TF matrix A is known and invertible, then the system of equations (3) can be readily solved to find the required filters h. Typically, the TF matrix A is determined (either from measurements on a dummy head, or through calculations using some assumed head model) for a fixed head location (the “design position”). However, if A varies from its design values, then the calculated filters will no longer produce the desired crosstalk cancellation. In practice, variation of A occurs whenever the listener moves his head or when different listeners use the system. This is a fundamental problem with known acoustic crosstalk cancellation systems.
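The effect described above can be illustrated numerically at a single frequency. The following is a minimal sketch, not the method of any particular system: the matrix values, the 0.3 rad phase perturbation standing in for a head movement, and the helper names `design_filters` and `crosstalk_db` are all hypothetical choices made for illustration.

```python
import numpy as np

def design_filters(A):
    """Solve A h = b with b = [b_R, b_L] = [1, 0] for h = [h1, h2] at one frequency."""
    b = np.array([1.0, 0.0], dtype=complex)
    return np.linalg.solve(A, b)

def crosstalk_db(A_actual, h):
    """Left-ear (crosstalk) level relative to the right-ear level, in dB."""
    b_R, b_L = A_actual @ h
    return 20.0 * np.log10(abs(b_L) / abs(b_R))

# Hypothetical TF matrix [[a1R, a2R], [a1L, a2L]] at the design position.
cross = 0.6 * np.exp(-0.8j)
A_design = np.array([[1.0, cross], [cross, 1.0]], dtype=complex)

# Hypothetical head movement: opposite 0.3 rad phase shifts on the acoustic paths.
shift = np.exp(-0.3j * np.array([[1.0, -1.0], [-1.0, 1.0]]))
A_moved = A_design * shift

h = design_filters(A_design)
b_design = A_design @ h             # close to [1, 0]: crosstalk canceled
leak_db = crosstalk_db(A_moved, h)  # finite level: cancellation degraded
print(np.round(b_design, 6), round(leak_db, 1))
```

With the filters held fixed at their design values, the left-ear response is no longer zero once the actual matrix differs from the design matrix, which is the failure mode the text describes.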
Robustness to head movements is frequency-dependent, and for a given frequency, there is a specific loudspeaker spacing which gives the best performance in terms of robustness. (See D. B. Ward et al., “Optimum loudspeaker spacing for robust crosstalk cancellation”, Proc. IEEE Conf. Acoustic Speech Signal Processing (ICASSP-98), Seattle, May 1998, Vol. 6, pp. 3541-3544.) However, as frequency increases, the loudspeaker spacing required to give good robustness performance becomes impractical. For example, for a head distance of $d_H = 0.5$ m (typical for a desktop audio system) and a head radius of $r_H = 0.0875$ m, a loudspeaker spacing of approximately 0.1 m is required. For a more practical loudspeaker spacing of 0.25 m, the conventional crosstalk canceler is extremely non-robust at a frequency of 4 kHz, and head movements of as little as 2 cm can destroy the crosstalk cancellation effect. Thus, for a fixed loudspeaker spacing, the conventional crosstalk canceler becomes inherently non-robust at certain frequencies.
Differences between the assumed TF model and the actual TF model can be considered as perturbations of the acoustic TF matrix A of Eq. 3. These differences include movement of the head from its design position, and differences between different HRTFs. From linear systems theory, the robustness of the system of Eq. 3 to perturbation of a symmetric matrix A is reflected by its condition number, defined for A complex as
\[
\operatorname{cond}\{A\} = \frac{\sigma_{\max}(A A^{H})}{\sigma_{\min}(A A^{H})} \tag{4}
\]
where $\sigma_{\min}(x)$ and $\sigma_{\max}(x)$ represent the smallest and largest singular values respectively. For a two-channel crosstalk canceler, A has only two singular values. When A is ill-conditioned, the crosstalk canceler will be sensitive to variations in head position. Thus, it is important to consider under which configurations the matrix A becomes ill-conditioned.
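As a rough illustration, the quantity in Eq. (4) can be evaluated with a singular value decomposition. The sketch below follows Eq. (4) as printed here; the function name `cond_eq4` and both example matrices are hypothetical. Because the singular values of $A A^{H}$ are the squares of those of A, this ratio equals the square of the usual 2-norm condition number, which the example prints alongside for comparison.

```python
import numpy as np

def cond_eq4(A):
    """Ratio of the largest to the smallest singular value of A A^H, as in Eq. (4)."""
    s = np.linalg.svd(A @ A.conj().T, compute_uv=False)  # returned in descending order
    return s[0] / s[-1]

# Hypothetical single-frequency TF matrices [[a1R, a2R], [a1L, a2L]].
cross = 0.6 * np.exp(-0.8j)
A_well = np.array([[1.0, cross], [cross, 1.0]], dtype=complex)  # distinct paths
A_ill = np.array([[1.0, 0.99], [0.99, 1.0]], dtype=complex)     # nearly identical paths

for name, A in (("well-conditioned", A_well), ("ill-conditioned", A_ill)):
    # Second value: square of NumPy's default (2-norm) condition number.
    print(name, cond_eq4(A), np.linalg.cond(A) ** 2)
```

The nearly singular matrix, where both loudspeakers look almost the same to both ears, gives a much larger value, corresponding to the ill-conditioned case in the text.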
Consider the following model for the TF from the nth loudspeaker to the right ear:
$a_n^R = e^{j2\pi \ldots}$
Elko Gary W.
Ward Darren B.
Baker & McKenzie
Isen Forester W.
Pendleton Brian