Reexamination Certificate
2000-06-14
2004-01-13
McFadden, Susan (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Psychoacoustic
C704S500000, C704S503000
active
06678648
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to the field of audio encoding, and more particularly to a fast loop iteration and bitstream formatting method, wherein the method is especially suited for MPEG-compliant audio encoding.
2. Description of the Related Art
In general, an audio encoder processes a digital audio signal and produces a compressed bit stream suitable for storage. A standard method for audio encoding and decoding is specified by “CODING OF MOVING PICTURES AND ASSOCIATED AUDIO FOR DIGITAL STORAGE MEDIA AT UP TO ABOUT 1.5 MBIT/s, Part 3 Audio” (3-11171 rev 1), submitted for approval to ISO-IEC/JTC1 SC29, and prepared by SC29/WG11, also known as MPEG (Moving Picture Experts Group). This draft version was adopted with some modifications as ISO/IEC 11172-3:1993(E) (hereinafter “MPEG-1 Audio Encoding”). The disclosure of the MPEG-1 Audio Encoding standard specification is herein incorporated by reference. Layer III of this standard is often referred to as “MP3” or “MP3 audio encoding.” The exact encoder algorithm is not standardized, and a compliant system may use various means for encoding, such as estimation of the auditory masking threshold, quantization, and scaling. However, the encoder output must be such that a decoder conforming to the MPEG-1 standard will produce audio suitable for an intended application.
As shown in FIG. 1, input audio samples are fed into the encoder 2. The mapping stage 4 creates a filtered and sub-sampled representation of the input audio stream. The mapped samples may be called either sub-band samples (as in Layer I, see below) or transformed sub-band samples (as in Layer III). A psychoacoustic model 10 creates a set of data to control the quantizer and coding block 6. The data supplied by the psychoacoustic model 10 may vary depending on the actual coder implementation 6. One possibility is to use an estimation of a masking threshold to do this quantizer control. The quantizer and coding block 6 creates a set of coding symbols from the mapped input samples. Again, the actual implementation of the quantizer and coder block 6 can depend on the encoding system. The frame packing block 8 assembles the actual bit stream from the output data of the other blocks, and adds other information (e.g. error correction) if necessary.
In general, as shown in FIG. 3, each quantized data frame 30 contains 576 data samples. Each frame 30 is divided into three sub-regions 32, 34, 36, with each region containing an even number of data samples, and with at least one region further divided into sub-regions. Adjacent data samples 38, or “data pairs,” are used as X, Y coordinates into a Huffman codebook, which provides a single code value for each data pair, as illustrated in FIG. 4. A codebook is a table containing, for each data pair, the bit code used to encode the pair and the length of that code. For certain regions, the data may be encoded in groups of four (quadruples) instead of pairs. The MPEG-1 standard uses 32 different codebooks, of which two or three are candidates for each sub-region, depending on the maximum data value in each sub-region. The “optimal” codebook for each sub-region is the single codebook from among the candidate codebooks that uses the fewest total bits to code the entire sub-region.
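The codebook selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder's actual implementation: the two toy codebooks and their code lengths are invented for the example (they are not the MPEG-1 Huffman tables), and the names `region_bits` and `pick_codebook` are hypothetical. The sketch treats adjacent sample magnitudes as (X, Y) pairs, sums the code lengths each candidate codebook would need for a sub-region, and keeps the candidate with the fewest total bits.

```python
from typing import Dict, Sequence, Tuple

# A codebook maps an (x, y) data pair to its code length in bits.
Codebook = Dict[Tuple[int, int], int]

# Hypothetical toy codebooks. The lengths are invented for illustration
# and are NOT taken from the MPEG-1 Huffman tables.
TOY_BOOK_A: Codebook = {(0, 0): 1, (0, 1): 3, (1, 0): 3, (1, 1): 3}
TOY_BOOK_B: Codebook = {(0, 0): 2, (0, 1): 2, (1, 0): 2, (1, 1): 2}


def region_bits(magnitudes: Sequence[int], book: Codebook) -> int:
    """Total code length for a sub-region, reading samples as (x, y) pairs."""
    if len(magnitudes) % 2:
        raise ValueError("a sub-region must hold an even number of samples")
    total = 0
    for i in range(0, len(magnitudes), 2):
        total += book[(magnitudes[i], magnitudes[i + 1])]
    return total


def pick_codebook(
    magnitudes: Sequence[int], candidates: Dict[str, Codebook]
) -> Tuple[str, int]:
    """Return the candidate codebook name with the fewest total bits."""
    costs = {name: region_bits(magnitudes, book) for name, book in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


candidates = {"A": TOY_BOOK_A, "B": TOY_BOOK_B}
print(pick_codebook([0, 0, 0, 0, 1, 1], candidates))  # -> ('A', 5)
print(pick_codebook([1, 0, 0, 1, 1, 1], candidates))  # -> ('B', 6)
```

In this toy setup, codebook A wins when the sub-region is mostly zeros, and codebook B wins when nonzero pairs dominate, which is why the choice has to be made per sub-region.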
Depending on the application, different layers of the coding system having increasing encoder complexity and performance can be used. An ISO MPEG Audio Layer N decoder is able to decode bit stream data which has been encoded in Layer N and all layers below N, as described below:
Layer I:
This layer contains the basic mapping of the digital audio input into 32 sub-bands, fixed segmentation to format the data into blocks, a psychoacoustic model to determine the adaptive bit allocation, and quantization using block companding and formatting.
Layer II:
This layer provides additional coding of bit allocation, scale factors and samples, and a different framing is used.
Layer III:
This layer introduces increased frequency resolution based on a hybrid filter bank. It adds a different (non-uniform) quantizer, adaptive segmentation and entropy coding of the quantized values.
Joint stereo coding can be added as an additional feature to any of the layers.
A decoder 12 accepts the compressed audio bit stream, decodes the data elements, and uses the information to produce digital audio output, as shown in FIG. 2. The bit stream data is fed into the decoder 12. Then, the bit stream unpacking and decoding block 14 performs error detection, if error-checking has been applied by the encoder 2. The bit stream data is unpacked to recover the various pieces of information. The reconstruction block 16 reconstructs the quantized version of the set of mapped samples. The inverse mapping block 18 transforms these mapped samples back into uniform PCM (pulse code modulation).
As originally envisioned by the drafters of the MPEG audio encoder specification, the encoder would be implemented in hardware. Hardware implementations provide dedicated processing, but generally have limited available memory. For software MPEG encoding and decoding implementations, such as software programs running on Intel Pentium™ class microprocessors, the need for greater processing efficiency has arisen, while the memory restrictions are less critical. Specifically, in prior art solutions, it is inefficient to repeatedly calculate the absolute values of the samples within an inner iteration loop.
SUMMARY OF THE INVENTION
In general, the present invention performs a sign and an absolute value calculation outside of the quantization inner loop, thereby reducing redundant calculations. The stored sign and absolute values can also be used in the frame packing block, further increasing processing efficiency. Thus, the present invention improves the performance of an MPEG encoder. The method of the present invention may be incorporated into a standard MPEG audio encoder in order to improve the processing efficiency of the encoder.
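The idea of moving the sign and absolute value calculation out of the inner loop can be sketched as follows. This is a minimal illustration under stated assumptions and not the claimed method itself: the power-law quantizer, the bit-cost model `toy_bit_count`, the step-size search, and all function names are hypothetical stand-ins chosen for the example. The point shown is only that the sign flags and magnitudes are computed once per frame, after which every loop iteration works on the stored magnitudes, and the stored signs remain available for a later packing step.

```python
from typing import List, Sequence, Tuple


def split_sign_magnitude(samples: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Single pass before the loop: sign flags (1 = negative) and magnitudes."""
    signs = [1 if s < 0 else 0 for s in samples]
    mags = [abs(s) for s in samples]
    return signs, mags


def quantize(mags: Sequence[float], step: int) -> List[int]:
    """Toy power-law quantizer on magnitudes only (assumed form, for illustration)."""
    scale = 2.0 ** (-step / 4.0)
    return [int((m * scale) ** 0.75 + 0.5) for m in mags]


def toy_bit_count(q: Sequence[int]) -> int:
    """Hypothetical cost model standing in for the real Huffman bit count:
    magnitude bits plus one sign bit for each nonzero value."""
    return sum(v.bit_length() + (1 if v else 0) for v in q)


def inner_loop(
    samples: Sequence[float], bit_budget: int, max_step: int = 64
) -> Tuple[int, List[int], List[int]]:
    """Coarsen the step size until the quantized frame fits the bit budget."""
    signs, mags = split_sign_magnitude(samples)  # done once, outside the loop
    for step in range(max_step + 1):
        q = quantize(mags, step)  # no abs() or sign test inside the loop
        if toy_bit_count(q) <= bit_budget:
            return step, q, signs  # signs are reused later, e.g. when packing
    raise ValueError("bit budget not reachable within max_step")


step, q, signs = inner_loop([-8.0, 4.0, 0.0, -1.0], bit_budget=6)
print(step, q, signs)
```

A loop that instead called `abs()` on every sample in every iteration would produce the same quantized values while repeating work whose result never changes between iterations.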
REFERENCES:
patent: 5227788 (1993-07-01), Johnston et al.
patent: 5341457 (1994-08-01), Hall, II et al.
patent: 5535300 (1996-07-01), Hall, II et al.
patent: 5559722 (1996-09-01), Nickerson
patent: 5663725 (1997-09-01), Jang
patent: 5748121 (1998-05-01), Romriell
patent: 5809474 (1998-09-01), Park
patent: 5848195 (1998-12-01), Romriell
patent: 5864802 (1999-01-01), Kim et al.
patent: 5923376 (1999-07-01), Pullen et al.
patent: 5956674 (1999-09-01), Smyth et al.
patent: 5974380 (1999-10-01), Smyth et al.
patent: 5978762 (1999-11-01), Smyth et al.
patent: 6223192 (2001-04-01), Oberman et al.
patent: 6256653 (2001-07-01), Juffa et al.
patent: 6295009 (2001-09-01), Goto
patent: 6300888 (2001-10-01), Chen et al.
patent: 6542863 (2003-04-01), Surucu
patent: 6601032 (2003-07-01), Surucu
Moving Picture Experts Group, “Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 MBIT/s, Part 3 Audio” 3-11171 rev 1.
Moving Picture Experts Group, “Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 MBIT/s, Part 3 Audio” ISO/IEC 11172-3:1993(E).
Intervideo Inc.
Johnson Doyle B.
McFadden Susan
Reed Smith Crosby Heafey