Method of synthesizing a three dimensional sound-field
Electrical audio signal processing systems and devices – Binaural and stereophonic – Pseudo stereophonic
Type: Reexamination Certificate
Filed: 1999-06-14
Issued: 2003-06-10
Examiner: Lee, Ping (Department: 2644)
US Class: C381S017000
Status: active
Patent Number: 06577736
ABSTRACT:
This invention relates to a method of synthesising a three dimensional sound field.
The processing of audio signals to reproduce a three dimensional sound-field on replay to a listener having two ears has been a goal for inventors for many years. One approach has been to use many sound reproduction channels to surround the listener with a multiplicity of sound sources such as loudspeakers. Another approach has been to use a dummy head having microphones positioned in the auditory canals of artificial ears to make sound recordings for headphone listening. An especially promising approach to the binaural synthesis of such a sound-field has been described in EP-B-0689756, which describes the synthesis of a sound-field using a pair of loudspeakers and only two signal channels, the sound-field nevertheless having directional information allowing a listener to perceive sound sources appearing to lie anywhere on a sphere surrounding the head of a listener placed at the centre of the sphere.
A monophonic sound source can be digitally processed via a “Head-Response Transfer Function” (HRTF), such that the resultant stereo-pair signal contains natural 3D-sound cues, as shown in FIG. 1. The HRTF can be implemented using a pair of filters, one associated with a left-ear response and the other with a right-ear response, sometimes called a binaural placement filter. These sound cues are introduced naturally by the acoustic properties of the head and ears when we listen to sounds in real life, and they include the inter-aural amplitude difference (IAD), the inter-aural time difference (ITD) and spectral shaping by the outer ear. When this stereo signal pair is introduced efficiently into the appropriate ears of the listener, by headphones say, then he or she perceives the original sound to be at a position in space in accordance with the spatial location associated with the particular HRTF which was used for the signal-processing.
When one listens through loudspeakers instead of headphones, as is shown in FIG. 2, then the signals are not conveyed efficiently into the ears, for there is “transaural acoustic crosstalk” present which inhibits the 3D-sound cues. This means that the left ear hears a little of what the right ear is hearing (after a small, additional time-delay of around 0.25 ms), and vice versa, as shown in FIG. 3. In order to prevent this happening, it is known to create appropriate “crosstalk cancellation” or “crosstalk compensation” signals from the opposite loudspeaker. These signals are equal in magnitude and inverted (opposite in phase) with respect to the crosstalk signals, and are designed to cancel them out. There are more advanced schemes which anticipate the secondary (and higher-order) effects of the cancellation signals themselves contributing to secondary crosstalk, and the correction thereof, and these methods are known in the prior art. A typical prior-art scheme (after M. R. Schroeder, “Models of hearing”, Proc. IEEE, vol. 63, no. 9, 1975, pp. 1332-1350) is shown in FIG. 4.
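The basic idea behind such cancellers can be sketched as a 2x2 matrix inversion carried out per frequency bin. The following is a generic illustration of that technique, not the particular circuit of FIG. 4, and the speaker-to-ear responses used in the toy example are hypothetical placeholders.

import numpy as np

def crosstalk_cancel(binaural_left, binaural_right, h_ipsi, h_contra, n_fft=4096):
    """Generic frequency-domain crosstalk canceller (a sketch only).

    h_ipsi is the loudspeaker-to-same-side-ear response and h_contra the
    loudspeaker-to-opposite-ear (crosstalk) response. Inverting the matrix
    [[H_ipsi, H_contra], [H_contra, H_ipsi]] bin by bin gives the speaker
    feeds that deliver the intended binaural pair at the ears; a small
    regularising term keeps near-zero bins from blowing up. Signals are
    assumed shorter than n_fft (circular-convolution effects ignored)."""
    Hi = np.fft.rfft(h_ipsi, n_fft)
    Hc = np.fft.rfft(h_contra, n_fft)
    XL = np.fft.rfft(binaural_left, n_fft)
    XR = np.fft.rfft(binaural_right, n_fft)
    det = Hi * Hi - Hc * Hc + 1e-6          # matrix determinant, lightly regularised
    SL = (Hi * XL - Hc * XR) / det          # left loudspeaker feed
    SR = (Hi * XR - Hc * XL) / det          # right loudspeaker feed
    return np.fft.irfft(SL, n_fft), np.fft.irfft(SR, n_fft)

# Toy example with placeholder speaker-to-ear responses.
binaural_l = np.random.randn(2048)
binaural_r = np.random.randn(2048)
h_ipsi = np.zeros(256)
h_ipsi[0] = 1.0
h_contra = np.zeros(256)
h_contra[11] = 0.5                           # crosstalk ~0.25 ms later, attenuated
spk_l, spk_r = crosstalk_cancel(binaural_l, binaural_r, h_ipsi, h_contra)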
When the HRTF processing and crosstalk cancellation are carried out sequentially (FIG. 5) and correctly, and using high-quality HRTF source data, then the effects can be quite remarkable. For example, it is possible to move the image of a sound-source around the listener in a complete horizontal circle, beginning in front, moving around the right-hand side of the listener, behind the listener, and back around the left-hand side to the front again. It is also possible to make the sound source move in a vertical circle around the listener, and indeed to make the sound appear to come from any selected position in space. However, some particular positions are more difficult to synthesise than others: some, it is believed, for psychoacoustic reasons, and some for practical reasons.
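In signal terms, the sequential arrangement of FIG. 5 amounts to the cascade below, written here as a frequency-domain matrix product purely for illustration (the notation is ours, not the patent's): the mono source X is first binaurally placed by the HRTF pair, and the result is then multiplied by the inverse of the loudspeaker-to-ear matrix, which performs the crosstalk cancellation.

$$
\begin{pmatrix} S_L(f) \\ S_R(f) \end{pmatrix}
=
\begin{pmatrix} H_{ii}(f) & H_{ic}(f) \\ H_{ic}(f) & H_{ii}(f) \end{pmatrix}^{-1}
\begin{pmatrix} A_L(f) \\ A_R(f) \end{pmatrix}
X(f)
$$

Here A_L and A_R are the left- and right-ear HRTFs for the desired source direction, and H_ii and H_ic are the ipsilateral and contralateral loudspeaker-to-ear responses.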
For example, the effectiveness of sound sources moving directly upwards and downwards is greater at the sides of the listener (azimuth=90°) than directly in front (azimuth=0°). This is probably because there is more left-right difference information for the brain to work with. Similarly, it is difficult to differentiate a sound source directly in front of the listener (azimuth=0°) from a source directly behind the listener (azimuth=180°). This is because there is no time-domain information present for the brain to operate with (ITD=0), and the only other information available to the brain, spectral data, is somewhat similar in both of these positions. In practice, there is more high-frequency (HF) energy perceived when the source is in front of the listener, because the high frequencies from frontal sources are reflected into the auditory canal from the rear wall of the concha, whereas from a rearward source they cannot diffract around the pinna sufficiently.
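The ITD=0 ambiguity can be seen from a common first-order approximation (quoted here for illustration; it is not taken from the patent text), in which the inter-aural delay is governed by the sine of the azimuth angle:

$$
\mathrm{ITD}(\theta) \;\approx\; \frac{d}{c}\,\sin\theta
$$

where d is the inter-aural spacing (roughly 0.18 m), c the speed of sound (about 343 m/s) and θ the azimuth. This peaks at roughly 0.5 ms at θ = 90° and vanishes at both θ = 0° and θ = 180°, which is why a purely frontal and a purely rearward source present the brain with the same (null) time-domain cue.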
In practical terms, a limiting feature in the reproduction of 3D-sound from two loudspeakers is the adequacy of the transaural crosstalk cancellation, and there are three significant factors here, as follows.
1. HRTF quality. The quality of the 30° HRTF (FIG. 3) used to derive the cancellation algorithms (FIG. 4) is important. Both the artificial head from which they derive and the measurement methodology must be adequate.
2. Signal-processing algorithms. The algorithms must be executed effectively.
3. HF effects. In theory, it is possible to carry out “perfect” crosstalk cancellation, but not in practice. Setting aside the differences between individual listeners and the artificial head from which the algorithm HRTFs derive, the difficulties relate to the high-frequency components, above several kHz. When optimal cancellation is arranged to occur at each ear of the listener, the crosstalk wave and the cancellation wave combine to form a node. However, the node exists only at a single point in space, and as one moves further away from the node, the two signals are no longer mutually time-aligned, and so the cancellation is imperfect. For gross misalignment, the signals can actually combine to create a resultant signal which is greater at certain frequencies than the original, unwanted crosstalk itself, as the worked example below illustrates. However, in practice, the head acts as an effective barrier to the higher frequencies because of its size relative to the wavelengths in question, and so the transaural crosstalk is limited naturally, and the problem is not as bad as might be expected.
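To put rough numbers on this (our own illustrative arithmetic, not figures from the patent): at 8 kHz the wavelength in air is

$$
\lambda = \frac{c}{f} = \frac{343\ \mathrm{m\,s^{-1}}}{8000\ \mathrm{Hz}} \approx 43\ \mathrm{mm},
\qquad \tfrac{\lambda}{4} \approx 11\ \mathrm{mm},
\qquad \tfrac{\lambda}{2} \approx 21\ \mathrm{mm}.
$$

A path-length error of only a quarter wavelength (about 11 mm) already puts the crosstalk and cancellation waves 90° out of phase, and at a half-wavelength error (about 21 mm) the “cancellation” wave arrives in phase with the crosstalk, so the two add and the residual is about 6 dB louder than the uncancelled crosstalk would have been.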
There have been several attempts to limit the spatial dependency of crosstalk-cancellation systems at these higher frequencies. Cooper and Bauck (U.S. Pat. No. 4,893,342) introduced a high-cut filter into their crosstalk-cancellation scheme, so that the HF components (above about 8 kHz) were not cancelled at all, but were simply fed directly to the loudspeakers, just as they are in ordinary stereo. The problem with this is that the brain interprets the position of the HF sounds (i.e. “localises” the sounds) to be where the loudspeakers themselves are, because both ears hear correlated signals from each individual speaker. It is true that these frequencies are difficult to localise accurately, but the overall effect is nevertheless to create HF sounds of frontal origin for all required spatial positions, and this inhibits the illusion when trying to synthesise rearward-positioned sounds.
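That kind of band-splitting arrangement can be sketched as follows (our own illustration of the general approach, not the circuit claimed in U.S. Pat. No. 4,893,342; the 8 kHz crossover, the function names and the pass-through example are assumptions): the binaural pair is split at the crossover, only the low band is passed through a crosstalk canceller, and the high band goes straight to the loudspeakers as in ordinary stereo.

import numpy as np

def split_band(signal, fs, f_cut=8000.0):
    """Split a signal at f_cut with an FFT brick-wall crossover
    (a crude stand-in for the high-cut/low-cut filters of a real design)."""
    n = len(signal)
    spec = np.fft.rfft(signal, n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    low_spec = np.where(freqs <= f_cut, spec, 0.0)
    return np.fft.irfft(low_spec, n), np.fft.irfft(spec - low_spec, n)

def hf_bypass_canceller(binaural_left, binaural_right, cancel, fs, f_cut=8000.0):
    """Cancel crosstalk only below f_cut; pass the HF band straight through.

    `cancel` is an assumed callable mapping a (left, right) pair of low-band
    signals to loudspeaker feeds of the same length."""
    lo_l, hi_l = split_band(binaural_left, fs, f_cut)
    lo_r, hi_r = split_band(binaural_right, fs, f_cut)
    spk_l, spk_r = cancel(lo_l, lo_r)
    # The HF band is not cancelled: it reaches the speakers as ordinary stereo,
    # which is exactly why the brain tends to localise it at the speakers.
    return spk_l + hi_l, spk_r + hi_r

# Exercise the plumbing with a pass-through "canceller".
fs = 44100
x_l = np.random.randn(fs)
x_r = np.random.randn(fs)
out_l, out_r = hf_bypass_canceller(x_l, x_r, lambda l, r: (l, r), fs)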
Even when the crosstalk is optimally cancelled at high frequencies, the listener's head is never guaranteed to be exactly in the correct position, and so again the non-cancelled HF components are “localised” by the brain at the speakers themselves, and can therefore appear to originate in front of the listener, making rearward synthesis difficult to achieve.
The following additional practical aspects also hinder optimal transaural crosstalk cancellation:
1. The loudspeakers often do not have well-matched frequency responses.
2. The audio system may not have well-matched L-R gain.
3. The computer configuration (software presets) may be set so as to have inaccurate L-R balance.
Many sound sources which are used in computer games contain predominantly low
Central Research Laboratories Limited
Lee Ping
Oliff & Berridge, PLC