Electrical audio signal processing systems and devices – Binaural and stereophonic – Stereo speaker arrangement
Reexamination Certificate
1997-06-18
2001-06-05
Isen, Forester W. (Department: 2747)
C381S017000, C381S001000
active
06243476
ABSTRACT:
BACKGROUND OF THE INVENTION
Three-dimensional audio systems create an “immersive” auditory environment, where sounds can appear to originate from any direction with respect to the listener. Using “binaural synthesis” techniques, it is currently possible to deliver three-dimensional audio scenes through a pair of loudspeakers or headphones. Using loudspeakers involves greater complexity due to interference between acoustic outputs that does not occur with headphones. Consequently, a loudspeaker implementation requires not only synthesis of appropriate directional cues, but also further processing of the signals so that, in the acoustic output, sounds that would interfere with the spatial illusion provided by these cues are canceled. Existing systems require the listener to assume a fixed position with respect to the loudspeakers, because the cancellation functions correctly only in this orientation. If the listener moves outside a narrow equalization zone or “sweet spot,” the illusion is lost.
It is well known that directional cues are embodied in the transformation of sound pressure from the free field to the ears of a listener; see Jens Blauert, Spatial Hearing (1983). A “head-related transfer function” (HRTF) represents a measurement of this transformation for a specific sound location relative to the listener's head, and describes the diffraction of sound by the torso, head, and external ear (pinna). Consequently, a pair of HRTFs, based on a known or assumed spatial location of the sound source, process sound signals so they appear to the listener to emanate from the source location; that is, the HRTFs produce a “binaural” signal.
It is straightforward to synthesize directional cues by convolving a sound with the appropriate HRTFs, thereby creating a synthetic binaural signal. When this is done using HRTFs designed for a particular listener, localization performance essentially matches free-field listening; see Wightman et al., J. Acoust. Soc. Am. 85(2):858-867 and 868-878 (1989). The use of non-individualized HRTFs (that is, HRTFs designed generically and not for a particular listener) results in poorer localization performance, particularly regarding front-back confusion and elevation judgments; see Wenzel et al., J. Acoust. Soc. Am. 94(1):111-123 (1993).
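The convolution step described above can be illustrated with a minimal sketch. The function names and the impulse-response values are hypothetical and chosen only for illustration; they are not measured HRTFs and are not taken from the patent.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_synthesis(mono, hrir_left, hrir_right):
    """Filter one mono source with a left/right pair of head-related
    impulse responses (time-domain HRTFs) to get a two-channel signal."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (made-up values): a source on the listener's left,
# so the right-ear signal arrives two samples later and at half amplitude.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]

left, right = binaural_synthesis([1.0, 2.0], hrir_left, hrir_right)
```

In this toy case the interaural delay and level difference are the only cues; a measured HRTF pair would also carry the spectral shaping due to the torso, head, and pinna.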
The sound travelling from a loudspeaker to the listener's opposite ear is called “crosstalk,” and results in interference with the directional components encoded in the loudspeaker signals. That is, for each ear, sounds from the contralateral speaker will interfere with binaural signals from the ipsilateral speaker unless corrective steps are taken. Loudspeaker-based binaural systems, therefore, require crosstalk-cancellation systems. Such systems typically model the sound emanating from the speakers and reaching the ears using transfer functions; in particular, the transfer functions from two speakers to two ears form a 2×2 system transfer matrix. Crosstalk cancellation involves pre-filtering the signals with the inverse of this matrix before sending the signals to the speakers; in this way, the contralateral output is effectively canceled for each of the listener's ears.
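As a sketch of the matrix inversion just described, the following evaluates the 2×2 system at a single frequency, where each transfer function reduces to one complex number. The matrix values and helper names are assumptions made for illustration only; a practical system would perform this per frequency or with equivalent filters.

```python
import cmath


def invert_2x2(m):
    """Invert a 2x2 matrix of complex gains."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_2x2(m, v):
    """Multiply a 2x2 matrix by a 2-element vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


# Hypothetical transfer matrix at one frequency (made-up values):
# rows are ears (left, right), columns are speakers (left, right).
# The off-diagonal terms are the crosstalk paths: attenuated and delayed.
crosstalk = 0.5 * cmath.exp(-0.3j)
H = [[1.0 + 0j, crosstalk],
     [crosstalk, 1.0 + 0j]]

binaural = [1.0 + 0j, 0j]                       # desired signals at the ears
speakers = apply_2x2(invert_2x2(H), binaural)   # pre-filtered speaker feeds
ears = apply_2x2(H, speakers)                   # what actually reaches the ears
```

Because the speaker feeds are pre-multiplied by the inverse of `H`, the signals arriving at the ears equal the intended binaural pair up to rounding error, and the contralateral contribution at the right ear cancels.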
Crosstalk cancellation using non-individualized head models (i.e., HRTFs) is only effective at low frequencies, where considerable similarity exists between the head responses of different individuals (since at low frequencies the wavelength of sound approaches or exceeds the size of a listener's head). Despite this limitation, existing crosstalk-cancellation systems are quite effective at producing realistic three-dimensional sound images, particularly for laterally located sources. This is because the low-frequency interaural phase cues are of paramount importance to sound localization; when conflicting high- and low-frequency localization cues are presented to a subject, the sound will usually be perceived at the position indicated by the low-frequency cues (see Wightman et al., J. Acoust. Soc. Am. 91(3):1648-1661 (1992)). Accordingly, the cues most critical to sound localization are the ones most effectively treated by crosstalk cancellation.
Existing crosstalk-cancellation systems usually assume a symmetric listening situation, with the listener located directly between the speakers and facing forward. The assumption of symmetry leads to simplified implementations, such as the shuffler topology described in Cooper et al., J. Audio Eng. Soc. 37(1/2):3-19 (1989). One can compensate for a laterally displaced listener by delaying and attenuating one of the output channels (see U.S. Pat. Nos. 4,355,203 and 4,893,342). It is also possible to reformat the loudspeaker signals for different loudspeaker spread angles, as described, for example, in the '342 patent. It has not, however, been possible to maintain a binaural signal for a moving listener, or even for one whose head rotates.
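The delay-and-attenuate compensation for a laterally displaced listener might be sketched as follows. This is an illustrative assumption rather than the method of the cited patents: it assumes simple 1/r amplitude spreading, a nominal speed of sound, and hypothetical 2-D coordinates in metres, and it aligns the nearer speaker to the farther one.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, nominal value assumed for illustration


def displacement_compensation(listener, left_spk, right_spk, sample_rate=44100):
    """Return (delay_in_samples, gain) per channel so that both speaker
    signals arrive at the listener time-aligned and at equal level."""
    d_left = math.dist(listener, left_spk)
    d_right = math.dist(listener, right_spk)
    far = max(d_left, d_right)

    def compensate(distance):
        # The nearer speaker is delayed by the path-length difference and
        # attenuated by the distance ratio (1/r spreading assumption).
        delay = round((far - distance) / SPEED_OF_SOUND * sample_rate)
        gain = distance / far
        return delay, gain

    return {"left": compensate(d_left), "right": compensate(d_right)}


speakers = ((-1.0, 0.0), (1.0, 0.0))
centered = displacement_compensation((0.0, 2.0), *speakers)
shifted = displacement_compensation((0.5, 2.0), *speakers)  # moved to the right
```

For the centered listener neither channel is changed; for the listener shifted to the right, only the right (nearer) channel is delayed and attenuated.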
SUMMARY OF THE INVENTION
The present invention extends the concept of three-dimensional audio to a moving listener, allowing, in particular, for all types of head motions (including lateral and front-back motions, and head rotations). This is accomplished by tracking head position and incorporating this parameter into an enhanced model of binaural synthesis.
Accordingly, in a first aspect, the invention comprises a tracking system for detecting the position and, preferably, the angle of rotation of a listener's head; and means for generating a binaural signal for broadcast through a pair of loudspeakers, the acoustical presentation being perceived by the listener as three-dimensional sound—that is, as emanating from one or more apparent, predetermined spatial locations. In particular, the system includes a crosstalk canceller that is responsive to the tracking system, and which adds to the binaural signal a crosstalk cancellation signal based on the position (and/or the rotation angle) of the listener's head. The crosstalk canceller may be implemented in a recursive or feedforward design. Furthermore, the invention may compute the appropriate filter, delay, and gain characteristics directly from the output of the tracking system, or may instead be implemented as a set of filters (or, more typically, filter functions) pre-computed for various listening geometries, the appropriate filters being activated during operation as the listener moves; the system is also capable of interpolating among the pre-computed filters to more precisely accommodate user movements (not all of which will result in geometries coinciding with those upon which the pre-computed filters are based).
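Interpolation among pre-computed filters can be sketched as follows. The source does not state the interpolation scheme; this example assumes linear interpolation of FIR coefficients between the two nearest pre-computed head angles, and the table contents and function name are hypothetical.

```python
from bisect import bisect_right


def interpolate_filter(table, angle):
    """Linearly interpolate filter coefficients for a tracked head angle.

    table: list of (angle_in_degrees, coefficients), sorted by angle,
           with all coefficient lists the same length.
    Angles outside the table are clamped to the nearest entry.
    """
    angles = [a for a, _ in table]
    if angle <= angles[0]:
        return list(table[0][1])
    if angle >= angles[-1]:
        return list(table[-1][1])
    hi = bisect_right(angles, angle)
    a0, c0 = table[hi - 1]
    a1, c1 = table[hi]
    t = (angle - a0) / (a1 - a0)
    return [(1 - t) * x + t * y for x, y in zip(c0, c1)]


# Hypothetical two-tap filters pre-computed for three head rotations.
table = [(-10.0, [1.0, 0.0]),
         (0.0, [0.5, 0.5]),
         (10.0, [0.0, 1.0])]

midway = interpolate_filter(table, 5.0)
exact = interpolate_filter(table, 0.0)
clamped = interpolate_filter(table, 25.0)
```

Interpolating raw coefficients is the simplest choice; whether it is adequate depends on how densely the listening geometries are sampled, since widely spaced filters with different delays do not blend well.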
In a second aspect, the invention addresses the high-frequency components not generally affected by the crosstalk canceller. Moreover, since the wavelengths involved are small, cancellation of these frequencies cannot be accomplished using a non-individualized head model; attempts to cancel high-frequency crosstalk can actually sound worse than simply passing the high frequencies unmodified. Indeed, even when using an individualized head model, the high-frequency inversion becomes critically sensitive to positional errors because the size of the equalization zone is proportional to the wavelength. In the context of the present invention, however, high frequencies can prove problematic, interfering with dynamic localization by a moving listener. The invention addresses high-frequency interference by considering these frequencies in terms of power (rather than phase). By implementing the compensation in terms of power levels rather than phase adjustments, the invention avoids the shortcomings heretofore encountered in attempting to cancel high-frequency crosstalk.
Moreover, this approach is found to maintain the “power panning” property. As sound is panned to a particular speaker, the listener expects power to emanate from the directionally appropriate speaker; to the extent power output from the other speaker does not diminish accordingly, the power panning property is violated. The invention retains the appropriate power ratio for high frequencies using, for exampl
Isen Forester W.
Massachusetts Institute of Technology
Pendleton B. T.
Testa Hurwitz & Thibeault LLP