Data processing: speech signal processing – linguistics – language – Audio signal bandwidth compression or expansion
Reexamination Certificate
1999-08-19
2002-03-12
Dorvil, Richemond (Department: 2741)
C704S503000
active
06356870
ABSTRACT:
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not Applicable
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to multi-channel digital audio decoders for digital storage media and transmission media.
2. Description of the Related Art
Efficient multi-channel digital audio signal coding methods have been developed for storage or transmission applications such as the digital video disc (DVD) player and the high definition digital TV receiver (set-top-box). A description of one such method can be found in the ATSC Standard, “Digital Audio Compression (AC-3) Standard”, Document A/52, 20 Dec. 1995. The standard defines a coding method for up to six channels of multi-channel audio, that is, left, right, centre, surround left, surround right, and the low frequency effects (LFE) channel. Techniques of this type can be applied in general to code any number of channels of related or even unrelated audio data into single or multiple representations (bitstreams).
In the ATSC (AC-3) method, the input multi-channel digital audio source is compressed block by block at the encoder by first transforming each block of time domain audio samples into frequency coefficients using an analysis filter bank, then quantizing the resulting frequency coefficients into quantized coefficients with a determined bit allocation strategy, and finally formatting and packing the quantized coefficients and bit allocation information into a bitstream for storage or transmission.
Furthermore, depending upon the spectral and temporal characteristics of each channel in the audio source, the transformation of each audio channel block may be performed adaptively at the encoder to optimize the frequency/time resolution. This is achieved by adaptive switching between two transformations, one with a long transform block length and one with a shorter transform block length. The long transform block length, which has good frequency resolution, is used for improved coding performance, and the shorter transform block length, which has greater time resolution, is used for audio input signals which change rapidly in time.
At the decoder, each audio block is decompressed from the bitstreams by first determining the bit allocation information, then unpacking and de-quantizing the quantized coefficients, and finally inverse transforming the resulting frequency coefficients, based on the determined long or shorter transform length, to output time domain audio PCM data. The decoding processes are performed for each channel in the multi-channel audio data.
For reasons such as an overall system cost constraint or a physical limitation such as the number of output loudspeakers that can be used, downmixing of the decoded multi-channel audio may be performed so that the number of output channels at the decoder is reduced. Basically, downmixing is performed such that the multi-channel audio information is fully or partially preserved while the number of output channels is reduced. For example, multi-channel coded audio bitstreams may be decoded and mixed down to two output channels, the left and right channels, suitable for conventional stereo audio amplifier and loudspeaker systems. One method of downmixing may be described as:
$$A_i = \sum_{j=0}^{m} \left( a_{ij} \times CH_j \right)$$
where
i: the selected output audio channel number
j: input audio channel number
m: the total number of input audio channels
$A_i$: i-th output audio channel
$CH_j$: j-th input audio channel
$a_{ij}$: downmixing coefficient for the i-th output and j-th input audio channel
The downmixing method or coefficients may be designed such that the original or the approximate of the original decoded multi-channel signals may be derived from the mixed down channels.
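As an illustration only, the downmixing sum described above can be sketched in Python. The function name `downmix`, the list-of-lists data layout, and the coefficient values in the usage lines are assumptions made for this sketch; they are not taken from the ATSC standard or from this specification. The code sums over exactly the input channels it is given.

```python
def downmix(channels, coeffs):
    """Mix input channels down to fewer output channels.

    channels: list of input channels, each a list of PCM samples of equal length.
    coeffs:   one row per output channel; row i holds the downmixing
              coefficients a_ij, one per input channel j.
    Returns one list of samples per output channel.
    """
    num_samples = len(channels[0])
    outputs = []
    for row in coeffs:
        # A_i[k] = sum over j of a_ij * CH_j[k], evaluated for every sample k
        outputs.append([
            sum(a * ch[k] for a, ch in zip(row, channels))
            for k in range(num_samples)
        ])
    return outputs


# Hypothetical three-channel input (left, right, centre) mixed to stereo.
# The 0.5 centre weight is an illustrative value only.
left, right, centre = [1.0, 2.0], [3.0, 4.0], [2.0, 2.0]
stereo = downmix([left, right, centre],
                 [[1.0, 0.0, 0.5],   # left output
                  [0.0, 1.0, 0.5]])  # right output
```

Here each output channel keeps its own main channel at full level and receives half of the centre channel.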
The complexity or cost of decoding for such a current art multi-channel audio decoder is more or less proportional to the number of coded audio channels within the input bitstream. In particular, the inverse transform process, which is computationally the most intensive module of the audio decoder and incurs a much higher cost to implement compared to other processes within the audio decoder, is performed on every block of audio in every audio channel. For example, a six channel audio decoder would have about three times the complexity or cost of decoding compared to a stereo (two channel) audio decoder with the same decoding process for each audio channel.
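The proportionality described above can be checked with a few lines of arithmetic. This is a rough sketch that counts inverse transforms only; the function names and the `blocks_per_frame` parameter are assumptions introduced for illustration and do not model any other decoder stage.

```python
def inverse_transform_count(num_channels, blocks_per_frame=6):
    """Inverse transforms per frame when every block of every channel is transformed.

    blocks_per_frame is a hypothetical parameter; it cancels out of the ratio below.
    """
    return num_channels * blocks_per_frame


def relative_cost(channels_a, channels_b):
    """Cost of decoder A relative to decoder B, counting inverse transforms only."""
    return inverse_transform_count(channels_a) / inverse_transform_count(channels_b)


six_vs_stereo = relative_cost(6, 2)  # six channel decoder against a stereo decoder
```

Under this simple model the ratio depends only on the channel counts, which is where the factor of about three comes from.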
BRIEF SUMMARY OF THE INVENTION
It is an object of this invention to provide a method and apparatus for decoding a bitstream of transform coded multi-channel audio data which will overcome, or at least ameliorate, the foregoing disadvantages of the prior art.
One factor that affects the complexity or implementation cost of the mentioned inverse transform is the arithmetic precision used within the process. The precision adopted in this module has a direct relation to the cost (in terms of the amount of RAM/ROM required) and complexity in implementation. Also, the inverse transform is the most demanding stage in terms of introduction of round off noise. Generally, the higher the precision used within the inverse transform process, the higher the implementation cost and the output quality; and vice versa, the lower the precision used within the inverse transform process, the lower the implementation cost and the output quality.
Arithmetic precision considerations in the Inverse Transform involve the word size of the frequency coefficients and the twiddle factors used in each stage, as well as the intermediate data retained between stages. The frequency coefficients generated by the data decoding stage are retained to the degree of accuracy defined by the precision required.
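The effect of word size on round-off noise can be illustrated with a small experiment. The sketch below uses a naive inverse cosine transform as a stand-in; it is not the inverse transform defined by AC-3, and the function names, the fractional-bit counts, and the test coefficients are all assumptions chosen for illustration. Coefficients, twiddle factors, and each product are rounded to the chosen word size, and the result is compared with a floating-point reference.

```python
import math


def quantize(value, frac_bits):
    """Round value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def inverse_transform(coeffs, frac_bits=None):
    """Naive inverse cosine transform (illustrative stand-in only).

    With frac_bits set, the frequency coefficients, the cosine twiddle
    factors, and every product are rounded to that many fractional bits.
    With frac_bits=None, full floating-point precision is used.
    """
    n = len(coeffs)

    def q(v):
        return v if frac_bits is None else quantize(v, frac_bits)

    samples = []
    for t in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            twiddle = math.cos(math.pi * (2 * t + 1) * k / (2 * n))
            acc += q(q(c) * q(twiddle))
        samples.append(acc)
    return samples


def max_error(coeffs, frac_bits):
    """Largest absolute deviation from the floating-point reference."""
    reference = inverse_transform(coeffs)
    reduced = inverse_transform(coeffs, frac_bits)
    return max(abs(a - b) for a, b in zip(reference, reduced))


# Hypothetical block of eight frequency coefficients.
COEFFS = [0.5, -0.25, 0.125, 0.3, -0.1, 0.05, 0.2, -0.15]
ERR_8 = max_error(COEFFS, 8)    # shorter word size, more round-off noise
ERR_16 = max_error(COEFFS, 16)  # longer word size, less round-off noise
```

For this input the 16-bit error is smaller than the 8-bit error, in line with the trade-off between precision, cost, and output quality described above.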
On the other hand, the audio channels represented within the multi-channel audio bitstream may have different perceptual importance relative to the actual audio contents. For example, a surround effect channel may have relatively less perceptual importance compared to a main channel, or an audio block with a shorter transform block length, which has audio signals that change rapidly in time, may have a lower frequency resolution requirement compared to an audio block with a long transform block length.
By matching different precision for the inverse transform process within the multi-channel audio decoder with the audio contents within the coded multi-channel audio bitstream, the overall complexity or implementation cost of the decoder can be optimized.
According to a first aspect, this invention provides a method for decoding a bitstream of transform coded multi-channel audio data comprising the steps of:
(a) subjecting said bitstream to a block decoding process to obtain for each input audio channel within said multi-channel audio data a corresponding block of frequency coefficients;
(b) assigning to each said block of frequency coefficients a higher precision inverse transform or a lower precision inverse transform according to predetermined characteristics of said audio data represented by the block;
(c) subjecting each said block of frequency coefficients to higher precision inverse transform process or lower precision inverse transform process;
(d) generating a respective output audio signal in response to each said higher precision inverse transform process and each said lower precision inverse transform process.
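Step (b) of the method can be sketched as a small selection routine. The rule used here (lower precision for surround channels and for blocks coded with the shorter transform length) follows the examples given in the summary, but the exact rule, the channel names, the class and function names, and the 24-bit and 16-bit word sizes are hypothetical choices made for this sketch.

```python
class CoefficientBlock:
    """One decoded block of frequency coefficients for one channel."""

    def __init__(self, channel, short_transform, coeffs):
        self.channel = channel                  # e.g. "left", "surround_left"
        self.short_transform = short_transform  # True if the shorter block length was used
        self.coeffs = coeffs


# Hypothetical word sizes for the two inverse transform variants.
WORD_BITS = {"higher": 24, "lower": 16}
SURROUND_CHANNELS = {"surround_left", "surround_right"}


def assign_precision(block):
    """Pick the inverse transform precision from characteristics of the block."""
    if block.channel in SURROUND_CHANNELS or block.short_transform:
        return "lower"
    return "higher"


def plan_inverse_transforms(blocks):
    """Return (channel, precision, word size in bits) for every block."""
    plan = []
    for block in blocks:
        precision = assign_precision(block)
        plan.append((block.channel, precision, WORD_BITS[precision]))
    return plan


blocks = [
    CoefficientBlock("left", False, [0.0]),
    CoefficientBlock("surround_left", False, [0.0]),
    CoefficientBlock("centre", True, [0.0]),
]
plan = plan_inverse_transforms(blocks)
```

In this sketch only the long-block main channel is routed to the higher precision transform; the surround block and the short-block centre channel take the lower precision path.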
In a second aspect, this invention provides an apparatus for decoding a bitstream of transform coded multi-channel audio data comprising:
(a) block decoding means to produce for each input audio channel within the said multi-channel audio data a corresponding block of frequency coefficients;
(b) means for assigning to each said block of frequency coefficients a higher precision inverse transform or a lower precision inverse transform according to predetermined characteristics of said audio data represented by the block;
(c) means for subjecting each said block of frequency coefficients to said assigned higher precision inverse transform process or lower precision inverse transform process;
(d) means for generating a respective output audio signal in response to each said higher p
George Sapna
Hui Yau Wai Lucas
Dorvil Richemond
Iannucci Robert
Johnson Liba
SEED IP Law Group PLLC
STMicroelectronics Asia Pacific PTE Limited