Data processing: speech signal processing – linguistics – language – Speech signal processing – Psychoacoustic
Reexamination Certificate
2000-06-14
2003-04-01
McFadden, Susan (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Psychoacoustic
C704S500000, C704S503000, C704S229000
active
06542863
ABSTRACT:
BACKGROUND OF THE INVENTION
This patent application is related to U.S. patent application Ser. No. 09/595,387 entitled “A FAST CODE LENGTH SEARCH METHOD FOR MPEG AUDIO ENCODING” filed Jun. 14, 2000; and is related to U.S. patent application Ser. No. 09/595,391, entitled “A FAST LOOP ITERATION AND BITSTREAM FORMATTING METHOD FOR MPEG AUDIO ENCODING” filed Jun. 14, 2000; the disclosures of which are herein incorporated by reference.
1. Field of the Invention
The present invention relates generally to the field of audio encoding, and more particularly to a fast codebook search method for finding an optimal Huffman codebook from among a group of Huffman codebooks, wherein the method is especially suited for MPEG-compliant audio encoding.
2. Description of the Related Art
In general, an audio encoder processes a digital audio signal and produces a compressed bit stream suitable for storage. A standard method for audio encoding and decoding is specified by “CODING OF MOVING PICTURES AND ASSOCIATED AUDIO FOR DIGITAL STORAGE MEDIA AT UP TO ABOUT 1.5 MBIT/s, Part 3 Audio” (3-11171 rev 1), submitted for approval to ISO-IEC/JTC1 SC29, and prepared by SC29/WG11, also known as MPEG (Moving Picture Experts Group). This draft version was adopted with some modifications as ISO/IEC 11172-3:1993(E) (hereinafter “MPEG-1 Audio Encoding”). The disclosures of these MPEG-1 Audio Encoding standard specifications are herein incorporated by reference. Layer III of this standard is often referred to as “MP3” or “MP3 audio encoding.” The exact encoder algorithm is not standardized, and a compliant system may use various means for encoding, such as estimation of the auditory masking threshold, quantization, and scaling. However, the encoder output must be such that a decoder conforming to the MPEG-1 standard will produce audio suitable for an intended application.
As shown in FIG. 1, input audio samples are fed into the encoder 2. The mapping stage 4 creates a filtered and sub-sampled representation of the input audio stream. The mapped samples may be called either sub-band samples (as in Layer I, see below) or transformed sub-band samples (as in Layer III). A psychoacoustic model 10 creates a set of data to control the quantizer and coding block 6. The data supplied by the psychoacoustic model 10 may vary depending on the actual coder implementation 6. One possibility is to use an estimation of a masking threshold to do this quantizer control. The quantizer and coding block 6 creates a set of coding symbols from the mapped input samples. Again, the actual implementation of the quantizer and coder block 6 can depend on the encoding system. The frame packing block 8 assembles the actual bit stream from the output data of the other blocks, and adds other information (e.g. error correction) if necessary.
In general, as shown in FIG. 3, each quantized data frame 30 contains 576 data samples. Each frame 30 is divided into three sub-regions 32, 34, 36, with each region containing an even number of data samples, and with at least one region further divided into sub-regions. Adjacent data samples 38, or “data pairs,” are used as X, Y coordinates into a Huffman codebook, which provides a single code value for each data pair, as illustrated in FIG. 4. A codebook is a table containing bit codes for encoding the data pairs and a code length value. For certain regions, the data may be encoded in groups of four data samples (quadruples) instead of pairs. The MPEG-1 standard uses 32 different codebooks, of which two or three are candidates for each sub-region, depending on the maximum data value in each sub-region. The “optimal” codebook for each sub-region is the single codebook from among the candidate codebooks that uses the fewest total bits to code the entire sub-region.
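The selection criterion described above can be illustrated with a minimal sketch of an exhaustive search, in which each candidate codebook's total bit count is computed separately and the smallest total wins. The code-length tables below are invented toy values, not the Huffman tables of the MPEG-1 standard, and the names TOY_LENGTHS, pairs, total_bits and best_codebook_exhaustive are hypothetical. The sketch also assumes non-negative sample values small enough to index the toy tables directly, and it ignores sign bits and quadruples.

```python
# Toy code-length tables (assumption: invented values, NOT the MPEG-1 tables).
# TOY_LENGTHS[name][x][y] is the number of bits needed to code the pair (x, y).
TOY_LENGTHS = {
    "A": [[1, 3, 5], [3, 4, 6], [5, 6, 8]],
    "B": [[3, 3, 4], [3, 3, 4], [4, 4, 5]],
}


def pairs(samples):
    """Group adjacent samples into (x, y) data pairs."""
    if len(samples) % 2:
        raise ValueError("a sub-region must hold an even number of samples")
    return list(zip(samples[0::2], samples[1::2]))


def total_bits(samples, lengths):
    """Sum the code lengths of every data pair under one codebook."""
    return sum(lengths[x][y] for x, y in pairs(samples))


def best_codebook_exhaustive(samples, candidates):
    """Compute one total per candidate codebook and return the cheapest."""
    totals = {name: total_bits(samples, TOY_LENGTHS[name]) for name in candidates}
    best = min(totals, key=totals.get)
    return best, totals


sub_region = [0, 0, 0, 1, 2, 2, 1, 0]
best, totals = best_codebook_exhaustive(sub_region, ["A", "B"])
print(best, totals)
```

Note that this approach performs one table lookup and one addition per data pair for every candidate codebook, which is the per-candidate cost that a faster search would aim to reduce.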
Depending on the application, different layers of the coding system having increasing encoder complexity and performance can be used. An ISO MPEG Audio Layer N decoder is able to decode bit stream data that has been encoded in Layer N and all layers below N, as described below:
Layer I
This layer contains the basic mapping of the digital audio input into 32 sub-bands, fixed segmentation to format the data into blocks, a psychoacoustic model to determine the adaptive bit allocation, and quantization using block companding and formatting.
Layer II
This layer provides additional coding of bit allocation, scale factors and samples, and a different framing is used.
Layer III
This layer introduces increased frequency resolution based on a hybrid filter bank. It adds a different (non-uniform) quantizer, adaptive segmentation and entropy coding of the quantized values.
Joint stereo coding can be added as an additional feature to any of the layers.
A decoder 12 accepts the compressed audio bit stream, decodes the data elements, and uses the information to produce digital audio output, as shown in FIG. 2. The bit stream data is fed into the decoder 12. Then, the bit stream unpacking and decoding block 14 performs error detection, if error-checking has been applied by the encoder 2. The bit stream data is unpacked to recover the various pieces of information. The reconstruction block 16 reconstructs the quantized version of the set of mapped samples. The inverse mapping block 18 transforms these mapped samples back into uniform PCM (pulse code modulation).
As originally envisioned by the drafters of the MPEG audio encoder specification, the encoder would be implemented in hardware. Hardware implementations provide dedicated processing, but generally have limited available memory. For software MPEG encoding and decoding implementations such as software programs running on Intel Pentium™ class microprocessors, however, the need for greater processing efficiency has arisen, while the memory restrictions are less critical. Specifically, in prior art solutions, the processing time associated with selecting an optimal codebook from among a group of candidate codebooks is much too long.
SUMMARY OF THE INVENTION
The present invention is a fast codebook search method for finding an optimal Huffman codebook from a group of Huffman codebooks, wherein the method is especially suited for MPEG-compliant audio encoding. In order to select an optimal codebook from among candidate codebooks for a given sub-region, a bit difference table is created, which for any given data pair (or quadruple) contains a bit difference value. The bit difference value is the difference between the number of bits needed for a given data pair (or quadruple) in a first candidate codebook and a second candidate codebook [N bits−M bits]. By summing all such bit difference values for the data samples in a given sub-region, a quick determination can be made as to which codebook would encode the sub-region using the fewest bits (based on the size and/or sign of the sum(s)). For sub-regions having three candidate codebooks, two bit difference sums are calculated. For an implementation of the MPEG-1 Layer III Audio Encoding standard, only 20 bit difference tables are required in order to cover every possible combination of codebook candidates.
Thus, the present invention determines the optimal codebook for each sub-region by merely summing the bit difference values from the appropriate bit difference table. This allows for a quicker determination, with far fewer calculations than required by the prior art approach. Since the procedure is performed within an “inner loop” iteration, the present invention reduces the required computation time by about 50% if there are two codebooks in the group, and by approximately 33% if there are three codebooks.
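The bit difference idea summarized above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the code-length tables are invented toy values rather than the MPEG-1 Huffman tables, the names build_diff_table, diff_sum and pick_best are hypothetical, and for three candidates the sketch assumes both difference tables are taken relative to the first candidate, a detail this summary does not spell out. Sample values are assumed non-negative and small enough to index the toy tables; sign bits and quadruples are ignored.

```python
# Toy code-length tables (assumption: invented values, NOT the MPEG-1 tables).
TOY_A = [[1, 3, 5], [3, 4, 6], [5, 6, 8]]
TOY_B = [[3, 3, 4], [3, 3, 4], [4, 4, 5]]
TOY_C = [[4, 4, 4], [4, 4, 4], [4, 4, 4]]


def build_diff_table(first, second):
    """Entry [x][y] = bits in `first` minus bits in `second` (N bits - M bits)."""
    return [[n - m for n, m in zip(row_n, row_m)]
            for row_n, row_m in zip(first, second)]


# Difference tables are built once, outside the encoder's inner loop.
D_AB = build_diff_table(TOY_A, TOY_B)
D_AC = build_diff_table(TOY_A, TOY_C)


def diff_sum(samples, diff):
    """Sum the bit difference values over all data pairs of a sub-region."""
    return sum(diff[x][y] for x, y in zip(samples[0::2], samples[1::2]))


def pick_best(samples, diff_tables):
    """Return (index, sums); index 0 is the first candidate.

    A positive sum means the first candidate needs that many more bits than
    the other one, so the candidate with the largest positive sum is cheapest.
    If no sum is positive, the first candidate is kept.
    """
    sums = [diff_sum(samples, d) for d in diff_tables]
    best, best_saving = 0, 0
    for k, s in enumerate(sums, start=1):
        if s > best_saving:
            best, best_saving = k, s
    return best, sums


sub_region = [0, 0, 0, 1, 2, 2, 1, 0]
print(pick_best(sub_region, [D_AB]))        # two candidates: one sum
print(pick_best(sub_region, [D_AB, D_AC]))  # three candidates: two sums
```

With two candidates the inner loop accumulates one sum instead of two totals, and with three candidates it accumulates two sums instead of three, which is consistent with the reductions of about 50% and 33% stated above.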
Intervideo Inc.
McFadden Susan
Reed Smith Crosby Heafey LLP