Entrophy code mode switching for frequency-domain audio coding

Coded data generation or conversion – Digital code to digital code converters – To or from bit count codes

Reexamination Certificate

Filed: 1998-12-14
Issued: 2001-10-09
Examiner: Jeanpierre, Peguy (Department: 2819)
Classification code: C704S503000
Status: active
Patent number: 06300888

ABSTRACT:

FIELD OF THE INVENTION
The invention generally relates to frequency domain audio coding, and more specifically relates to entropy coding methods used in frequency domain audio encoders and decoders.
BACKGROUND
In a typical audio coding environment, data is formatted, if necessary (e.g., converted from an analog format), into a long sequence of symbols that is input to an encoder. The input data is encoded, transmitted over a communication channel (or simply stored), and then decoded by a decoder. During encoding, the input is pre-processed, sampled, converted, compressed or otherwise manipulated into a form suitable for transmission or storage. After transmission or storage, the decoder attempts to reconstruct the original input.
Audio coding techniques fall into two classes: time-domain techniques and frequency-domain techniques. Time-domain techniques (e.g., ADPCM and LPC) operate directly on the time-domain signal, while frequency-domain techniques transform the audio signal into the frequency domain, where compression is performed. Frequency-domain codecs (compressors/decompressors) can be further divided into sub-band coders and transform coders, although the distinction between the two is not always clear. Sub-band coders typically use bandpass filters to divide an input signal into a small number (e.g., four) of sub-bands, whereas transform coders typically have many sub-bands (and therefore a correspondingly large number of transform coefficients).
Processing an audio signal in the frequency domain is motivated by both classical signal processing theory and the human psychoacoustic model. Psychoacoustic coding takes advantage of known properties of the listener in order to reduce information content. For example, the inner ear, specifically the basilar membrane, behaves like a spectral analyzer and transforms the audio signal into spectral data before further neural processing proceeds. Frequency-domain audio codecs often take advantage of the auditory masking that occurs in the human hearing system by modifying an original signal to eliminate information redundancies. Since human ears are incapable of perceiving these modifications, one can achieve efficient compression without perceptible distortion.
Masking analysis is usually conducted in conjunction with quantization so that quantization noise can be conveniently “masked”. In modern audio coding techniques, the quantized spectral data are usually further compressed by applying entropy coding, e.g., Huffman coding. Compression is required because communication channels usually have limited available capacity or bandwidth. It is frequently necessary to reduce the information content of input data in order to allow it to be reliably transmitted, if at all, over the communication channel.
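The quantization step mentioned above can be illustrated with a minimal sketch. It assumes a uniform scalar quantizer with a single step size; a real codec would typically derive step sizes per band from the masking analysis, which is not modeled here. The function names and sample values are hypothetical.

```python
import math

def quantize(coeffs, step):
    """Uniform scalar quantizer (assumed, simplified): round each
    coefficient to the index of the nearest multiple of `step`,
    rounding halves away from zero."""
    return [int(math.copysign(math.floor(abs(c) / step + 0.5), c)) for c in coeffs]

def dequantize(indices, step):
    """Reconstruct approximate coefficient values from quantizer indices."""
    return [q * step for q in indices]

# Hypothetical spectral coefficients, ordered from low to high frequency.
spectrum = [12.3, -7.8, 3.1, 0.9, -0.4, 0.2]
indices = quantize(spectrum, step=2.0)
print(indices)
print(dequantize(indices, step=2.0))
```

Small coefficients quantize to zero, which is why the quantized spectrum is a good candidate for further entropy coding.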
Tremendous effort has been invested in developing lossless and lossy compression techniques for reducing the size of data to transmit or store. One popular lossless technique is Huffman encoding, which is a particular form of entropy encoding. Entropy coding assigns code words to different input sequences, and stores all input sequences in a code book. The complexity of entropy encoding depends on the number m of possible values an input sequence X may take. For small m, there are few possible input combinations, and therefore the code book for the messages can be very small (e.g., only a few bits are needed to unambiguously represent all possible input sequences). For digital applications, the code alphabet is most likely a series of binary digits {0, 1}, and code word lengths are measured in bits.
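As an illustration of a code book over the binary alphabet {0, 1}, the following is a minimal sketch of textbook Huffman coding over single symbols. It is not the coder described in this patent; the function names and sample data are assumptions chosen for the example.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code book (symbol -> bit string) from a symbol sequence."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, book):
    """Concatenate the code words for each symbol."""
    return "".join(book[s] for s in symbols)

def decode(bits, book):
    """Decode a bit string; works because Huffman codes are prefix-free."""
    inverse = {code: s for s, code in book.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

data = [0, 0, 0, 0, 1, 1, 2, 3]      # hypothetical quantized symbols
book = huffman_code(data)
bits = encode(data, book)
print(book, len(bits))
```

With this sample, the most frequent symbol receives the shortest code word, so the encoded length falls below the 16 bits that a fixed 2-bit code would need.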
If the input is known to be composed of symbols that are equally likely to occur, an optimal encoding uses equal-length code words. But an input stream rarely has an equal probability of containing any particular message. In practice, certain messages are more likely than others, and entropy encoders take advantage of this skew to minimize the average length of code words among expected inputs. Traditionally, however, fixed length input sequences are assigned variable length codes (or conversely, variable length sequences are assigned fixed length codes).
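The point about equal and unequal probabilities can be checked numerically with Shannon entropy, which gives a lower bound on the average code word length in bits per symbol. This is a minimal sketch; the two probability distributions are made-up examples, not data from the patent.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # four equally likely symbols
skewed = [0.70, 0.15, 0.10, 0.05]    # hypothetical skewed source

print(entropy_bits(uniform))  # equals log2(4): fixed 2-bit codes are optimal
print(entropy_bits(skewed))   # below 2 bits: variable-length codes can do better
```

For the uniform case the bound equals the fixed code word length, so nothing is gained by variable-length codes; for the skewed case the bound drops below 2 bits per symbol.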
SUMMARY
The invention relates to a method for selecting an entropy coding mode for frequency-domain audio coding. In particular, a given input stream representing audio input is partitioned into frequency ranges according to some statistical criteria derived from a statistical analysis of typical or actual input to be encoded. Each range is assigned an entropy encoder optimized to encode that range's type of data. During encoding and decoding, a mode selector applies the correct entropy method to the different frequency ranges. Partition boundaries can be decided in advance, allowing the decoder to implicitly know which decoding method to apply to encoded data. Or, a forward adaptive arrangement may be used, in which boundaries are flagged in the output stream by indicating a change in encoding mode for subsequent data.
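The two signalling arrangements described above can be sketched as follows. This is only an illustration of how mode changes might be conveyed: the token layout, the mode labels, and the function names are assumptions, and the actual entropy coding of the data within each range is omitted.

```python
MODE_NONZERO = "nonzero"  # hypothetical label: coder tuned to mostly non-zero data
MODE_ZERO = "zero"        # hypothetical label: coder tuned to mostly zero data

def split_fixed(coeffs, boundary):
    """Predetermined partition: the decoder knows `boundary` in advance,
    so no side information is needed to recover the modes."""
    return [(MODE_NONZERO, coeffs[:boundary]), (MODE_ZERO, coeffs[boundary:])]

def write_forward_adaptive(segments):
    """Flatten (mode, values) segments into a token stream, emitting a
    MODE token whenever the coding mode changes for subsequent data."""
    stream, current = [], None
    for mode, values in segments:
        if not values:
            continue
        if mode != current:
            stream.append(("MODE", mode))
            current = mode
        stream.extend(("DATA", v) for v in values)
    return stream

def read_forward_adaptive(stream):
    """Recover (mode, values) segments by following the MODE flags.
    Assumes the stream starts with a MODE token, as the writer guarantees."""
    segments = []
    for kind, payload in stream:
        if kind == "MODE":
            segments.append((payload, []))
        else:
            segments[-1][1].append(payload)
    return segments

coeffs = [5, -3, 2, 1, 0, 0, 1, 0]   # hypothetical quantized coefficients
segments = split_fixed(coeffs, boundary=4)
stream = write_forward_adaptive(segments)
print(stream)
```

The fixed arrangement costs no extra bits but cannot react to unusual input, while the forward-adaptive stream spends a flag at each boundary in exchange for flexibility.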
For natural sounds, such as speech and music, information content is concentrated in the low frequency range. This means that, statistically, the lower frequencies will have more non-zero energy values (after quantization), while the higher frequency range will have more zero values to reflect the lack of content in the higher frequencies. This statistical analysis can be used to define one or more partition boundaries separating lower and higher frequency ranges. For example, a single partition can be defined such that the lower ¼ of the frequency components are below the partition. Alternatively, one can set the partition so that approximately one-half of the critical bands are in each defined frequency band. (Critical bands are frequency ranges of non-uniform width that correspond to the human auditory system's sensitivity to particular frequencies.) The result of such a division is to define two frequency ranges, in which one contains predominantly non-zero frequency coefficients, while the other contains predominantly zero frequency coefficients. With advance knowledge of which ranges contain predominantly zero values and which contain predominantly non-zero values, each range can be encoded with an encoder optimized for that kind of data.
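The kind of statistical analysis described above might look like the following minimal sketch. The threshold heuristic in `choose_boundary` is an assumption made for illustration, as are the function names and the sample frames; the text does not prescribe a specific criterion.

```python
def zero_fraction(values):
    """Fraction of coefficients in `values` that are zero."""
    return sum(1 for v in values if v == 0) / len(values) if values else 0.0

def quarter_boundary(num_coeffs):
    """Fixed partition placing the lower 1/4 of the coefficients below it."""
    return num_coeffs // 4

def choose_boundary(frames, threshold=0.5):
    """Hypothetical heuristic: return the first coefficient index whose
    zero rate across the training frames exceeds `threshold`."""
    n = len(frames[0])
    for i in range(n):
        zero_rate = sum(1 for f in frames if f[i] == 0) / len(frames)
        if zero_rate > threshold:
            return i
    return n

# Hypothetical quantized frame, ordered from low to high frequency.
coeffs = [9, -4, 3, 1, 0, 1, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0]
b = quarter_boundary(len(coeffs))
print(zero_fraction(coeffs[:b]), zero_fraction(coeffs[b:]))

frames = [[3, 1, 0, 0], [2, 0, 0, 0], [4, 2, 1, 0]]
print(choose_boundary(frames))
```

In the sample frame the range below the boundary has no zeros, while most of the range above it is zero, matching the two-range picture in the text.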
In one implementation, the range containing predominantly zero values is encoded with a multi-level run-length encoder (RLE), i.e., an encoder that statistically correlates sequences of zero values with one or more non-zero symbols and assigns variable length code words to arbitrarily long input sequences of such zero and non-zero values. Similarly, the range containing mostly non-zero values is encoded with a variable-to-variable entropy encoder, where a variable length code word is assigned to arbitrarily long input sequences of quantization symbols. An overall more efficient process is achieved by choosing coding methods according to the properties of the input data. In practice, the number of partitions and frequency ranges will vary according to the type of data to be encoded and decoded.
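The idea of pairing runs of zeros with the non-zero symbol that follows can be sketched with a simple, single-level run-length coder. This is a deliberate simplification and not the multi-level scheme described above; the pair layout and function names are assumptions. In a fuller coder, each pair would then be assigned a variable length code word.

```python
def rle_encode(values):
    """Encode as (zero_run_length, nonzero_value) pairs.
    A trailing run of zeros is stored as (run_length, None)."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, None))
    return pairs

def rle_decode(pairs):
    """Expand (zero_run_length, value) pairs back into the original list."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if v is not None:
            out.append(v)
    return out

# Hypothetical high-frequency range: mostly zeros after quantization.
high = [0, 0, 0, 1, 0, 0, 0, 0, -1, 0, 0]
pairs = rle_encode(high)
print(pairs)
```

Eleven coefficients collapse to three pairs here, which shows why a run-based coder suits the mostly-zero range better than coding each coefficient separately.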

Affiliated with

Chen Wei-ge

Inventor

Lee Ming-Chieh

Inventor

Also associated with

Jean-Pierre Peguy

Examiner

Klarquist & Sparkman, LLP

Law Firm

Microsoft Corporation

Corporate Assignee
