Multi-level run length coding for frequency-domain audio coding

Data processing: speech signal processing – linguistics – language – Audio signal time compression or expansion

Reexamination Certificate

Details

C704S501000

active

06223162

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to setting one or more thresholds to divide an input signal into multiple regions, applying entropy encoding to a variable length first region using a variable length entropy code, and applying a second coding method to a second region.
BACKGROUND
In a typical audio coding environment, data is represented as a long sequence of symbols which is input to an encoder. The input data is encoded by an encoder, transmitted over a communication channel (or simply stored), and decoded by a decoder. During encoding, the input is pre-processed, sampled, converted, compressed or otherwise manipulated into a form for transmission or storage. After transmission or storage, the decoder attempts to reconstruct the original input.
Audio coding techniques can be generally categorized into two classes: time-domain techniques and frequency-domain techniques. Time-domain techniques (e.g., ADPCM, LPC) operate directly in the time domain, while frequency-domain techniques transform the audio signal into the frequency domain, where compression is performed. Frequency-domain codecs (compressors/decompressors) can be further separated into sub-band coders and transform coders, although the distinction between the two is not always clear. Sub-band coders typically use bandpass filters to divide an input signal into a small number (e.g., four) of sub-bands, whereas transform coders typically have many sub-bands (and therefore a correspondingly large number of transform coefficients). Processing an audio signal in the frequency domain is motivated by both classical signal processing theory and psychoacoustic models of human perception.
Psychoacoustic models take advantage of known properties of the listener in order to reduce information content. For example, the inner ear, specifically the basilar membrane, behaves like a spectral analyzer and transforms the audio signal into spectral data before further neural processing proceeds. Frequency-domain audio codecs often take advantage of the auditory masking that occurs in the human hearing system by modifying the original signal to eliminate information redundancies. Because human ears cannot perceive these modifications, efficient compression can be achieved without audible distortion.
Masking analysis is usually conducted in conjunction with quantization so that quantization noise can be conveniently “masked.” In modern audio coding techniques, the quantized spectral data are usually further compressed by applying entropy coding, e.g., Huffman coding. Compression is required because communication channels usually have limited available capacity or bandwidth. It is frequently necessary to reduce the information content of input data in order to allow it to be reliably transmitted, if at all, over the communication channel.
Tremendous effort has been invested in developing lossless and lossy compression techniques for reducing the size of data to transmit or store. One popular lossless technique is Huffman encoding, which is a particular form of entropy encoding. Entropy coding assigns code words to different input sequences, and stores all input sequences in a code book. The complexity of entropy encoding depends on the number m of possible values an input sequence X may take. For small m, there are few possible input combinations, and therefore the code book for the messages can be very small (e.g., only a few bits are needed to unambiguously represent all possible input sequences). For digital applications, the code alphabet is most likely a series of binary digits {0, 1}, and code word lengths are measured in bits.
If the input is known to be composed of symbols that are equally likely to occur, an optimal encoding uses code words of equal length. It is unusual, however, for every message in an input stream to be equally probable. In practice, certain messages are more likely than others, and entropy encoders take advantage of this non-uniformity to minimize the average length of code words among expected inputs. Traditionally, however, fixed length input sequences are assigned variable length codes (or conversely, variable length sequences are assigned fixed length codes).
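The effect of unequal symbol probabilities on code word length can be illustrated with a small Huffman construction. This is a minimal sketch, not the encoder described in this patent: the function names `huffman_code_lengths` and `average_bits` and the sample inputs are assumptions made for illustration only.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code word length in bits} for a Huffman code.

    Illustrative helper (hypothetical name); builds the code from the
    symbol counts observed in `symbols`.
    """
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def average_bits(symbols):
    """Average code word length, weighted by how often each symbol occurs."""
    lengths = huffman_code_lengths(symbols)
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum(counts[s] * lengths[s] for s in counts) / total


uniform = "abcd" * 4                          # four equally likely symbols
skewed = "a" * 8 + "b" * 4 + "c" * 2 + "d" * 2  # 'a' dominates

print(average_bits(uniform))  # 2.0 bits: equal-length code words
print(average_bits(skewed))   # 1.75 bits: frequent symbols get shorter words
```

With equally likely symbols the construction degenerates to equal-length code words, while the skewed input yields a shorter average length because the most frequent symbol receives a one-bit code word.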
By their nature, however, most compression techniques for audiovisual data are lossy processes. The level of quality and fidelity delivered in sound and video files depends primarily on how much bandwidth is available and whether the compressor/de-compressor (codec) is optimized to prepare output for an available bandwidth.
SUMMARY
The invention relates to a method for entropy coding information relating to frequency domain audio coefficients. In particular, the invention relates to a form of multi-level encoding of audio spectral frequency coefficients. Illustrated embodiments are particularly adapted to coding environments in which multiple encoding methods can be chosen based on statistical profiles for frequency coefficient ranges. Encoding methods can be optimized for portions of a frequency spectrum, such as portions having a predominant value.
In illustrated embodiments, the predominant value in certain frequency ranges is the zero-valued frequency coefficient, and these ranges are encoded with the multi-level run-length encoder (RLE). The multi-level encoder statistically correlates sequences of zero values with one or more non-zero symbols and assigns variable length code words to arbitrarily long input sequences of such zero and non-zero values. The RLE based entropy encoder uses a specialized code book generated with respect to the probability of receiving an input sequence of zero-valued spectral coefficients followed by a non-zero spectral coefficient. If code book size is an issue, the probabilities can be sorted and less probable input sequences excluded from the code book.
Similarly, the range containing mostly non-zero values is encoded with a variable-to-variable entropy encoder, where a variable length code word is assigned to arbitrarily long input sequences of quantization symbols. An overall more efficient process is achieved by choosing coding methods according to the properties of the input data. In practice, the number of partitions and frequency ranges will vary according to the type of data to be encoded and decoded.
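The idea of pairing a run of zero-valued coefficients with the non-zero coefficient that follows it, and then ranking those pairs by probability, can be sketched as follows. This is an illustration under stated assumptions, not the patented encoder: the function names `zero_run_symbols` and `build_code_book`, the `max_entries` parameter, and the sample coefficients are hypothetical. How sequences excluded from the code book are actually signalled is not described in this text and is left out.

```python
from collections import Counter


def zero_run_symbols(coeffs):
    """Convert quantized coefficients into (zero_run_length, nonzero_value) pairs.

    Returns the list of pairs and the number of trailing zeros that were
    not followed by a non-zero coefficient.
    """
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs, run


def build_code_book(pairs, max_entries):
    """Keep the `max_entries` most probable pairs, most probable first.

    Returns {pair: estimated probability}. Less probable pairs are simply
    dropped here; handling them is outside this sketch.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.most_common(max_entries)}


coeffs = [0, 0, 3, 0, 0, 3, 0, 1, 0, 0, 3, 0, 0, 0]
pairs, trailing_zeros = zero_run_symbols(coeffs)
print(pairs)                       # [(2, 3), (2, 3), (1, 1), (2, 3)]
print(trailing_zeros)              # 3
print(build_code_book(pairs, 1))   # {(2, 3): 0.75}
```

In a full encoder, each retained pair would then be assigned a variable length code word (for instance by a Huffman construction over the estimated probabilities), so that frequent run-and-value combinations cost the fewest bits.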
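One way to picture the partitioning is a threshold that separates a range of mostly non-zero coefficients from a range dominated by zeros. The rule below is a hypothetical sketch, not the threshold selection used by the invention: the function name `split_by_zero_density`, the `min_zero_fraction` parameter, and the assumption that the zero-dominated range lies at the upper end of the spectrum are all introduced here for illustration.

```python
def split_by_zero_density(coeffs, min_zero_fraction=0.8):
    """Return the smallest index k such that coeffs[k:] is mostly zeros.

    "Mostly" means at least `min_zero_fraction` of the tail is zero-valued.
    Returns len(coeffs) when no tail qualifies. Illustrative rule only.
    """
    n = len(coeffs)
    zeros_in_tail = 0
    best = n
    # Walk from the highest index downward, tracking zeros seen so far.
    for k in range(n - 1, -1, -1):
        if coeffs[k] == 0:
            zeros_in_tail += 1
        if zeros_in_tail / (n - k) >= min_zero_fraction:
            best = k  # the last qualifying k visited is the smallest one
    return best


coeffs = [5, -3, 2, 4, 1, 0, 0, 2, 0, 0, 0, 0]
k = split_by_zero_density(coeffs)
dense_region = coeffs[:k]    # candidate for variable-to-variable coding
sparse_region = coeffs[k:]   # candidate for run-length based coding
print(k)              # 5
print(dense_region)   # [5, -3, 2, 4, 1]
print(sparse_region)  # [0, 0, 2, 0, 0, 0, 0]
```

Each region can then be handed to the coding method whose statistical assumptions it fits best; more than one threshold would produce more than two regions.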


REFERENCES:
patent: 4706265 (1987-11-01), Furukawa
patent: 4744085 (1988-05-01), Fukatsu
patent: 5400075 (1995-03-01), Savatier
patent: 5550541 (1996-08-01), Todd
patent: 5552832 (1996-09-01), Astle
patent: 5579430 (1996-11-01), Grill et al.
patent: 5644305 (1997-07-01), Inoue et al.
patent: 5819215 (1998-10-01), Dobson et al.
patent: 5825830 (1998-10-01), Kopf
patent: 5831559 (1998-11-01), Agarwal et al.
patent: 5883589 (1999-03-01), Takashima et al.
patent: 5884269 (1999-03-01), Cellier et al.
patent: 5946043 (1999-08-01), Lee et al.
patent: 5959560 (1999-09-01), Said et al.
patent: 6029126 (2000-02-01), Malvar
patent: 0283735 A2 (1988-09-01), None
patent: 0612156 A2 (1989-04-01), None
patent: 0535571 A2 (1993-04-01), None
patent: 0540350 A2 (1993-05-01), None
patent: 0830029 A2 (1998-03-01), None
patent: 62-247626 (1987-10-01), None
patent: 09232968 (1997-09-01), None
patent: WO 98/40969 (1998-09-01), None
Tewfik et al., “Enhanced Wavelet Based Audio Coder,” Proc. ASILOMAR Conference, IEEE (1993).
ISO/IEC 13818-7: Information Technology—Generic coding of moving pictures and associated audio information—Part 7: Advanced Audio Coding, pp. I-iv, 1-147 (1997).
ISO/IEC 13818-7: Information Technology—Generic coding of moving pictures and associated audio information—Part 7: Advanced Audio Coding, Technical Corrigendum 1, pp. 1-22 (Dec. 1998).
Fogg, “Survey of Software and Hardware VLC Architectures,” SPIE vol. 2186, pp. 29-37 (no date of publication).
Gibson et al., Digital Compression of Multimedia, “Lossless Source Coding,” Chapter 2, pp. 17-62 (1998).
Gibson et al., Digital Compression of Multimedia, “Universal Lossless Source Coding,” Chapter 3, pp. 63-112 (1998).
Gibson et al., Digital Compression of Multimedia, “MPEG Audio,” Chapter 11.4,
