Efficient de-quantization in a digital video decoding...

Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal

Reexamination Certificate

Details

C708S402000

active

06507614

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the field of computer controlled multi-media audio visual display. More specifically, the present invention relates to an efficient decoding process for decoding audio/video material represented as a digital bit stream encoded using the Digital Video (DV) standard.
2. Related Art
Audio/visual (AV) material is increasingly stored, transmitted and rendered using digital data. Digital video representation of AV material facilitates its usage with computer controlled electronics and also facilitates high quality image and sound reproduction. Digital AV material is typically compressed (“encoded”) in order to reduce the computer resources required to store and transmit the digital data. Digital AV material can be encoded using a number of well known standards including, for example, the DV (Digital Video) standard, the MPEG (Moving Picture Experts Group) standard and the JPEG standard. The encoding standards also specify the associated decoding processes.
The DV decoding process includes a sub-step called “inverse quantization” which is also called “de-quantization.” Inverse quantization is a difficult part of the DV decoding process because the inverse quantization table that is used in DV decoding is not a pre-loaded matrix, as in MPEG decoding. Therefore, the quantization matrix used in DV decoding needs to be computed for each new 8×8 pixel (or “data”) block.
For example, FIG. 1 illustrates a step in the inverse quantization process of a DV decoder. For 8×8-DCT (Discrete Cosine Transform) mode, an input 8×8 block of data 10 is multiplied by an 8×8 quantization matrix 20 to produce an 8×8 DCT matrix of coefficients 30. Each X coefficient (or “pixel”) of matrix 10 is multiplied by its associated Q coefficient of matrix 20 to produce a resultant coefficient in the 8×8 DCT matrix 30. The 8×8 DCT matrix 30 is the output of the inverse quantization of the input pixel block 10. However, each quantization coefficient (Qij) for each associated pixel (Xij) in the 8×8 matrix 10 is dynamically calculated based on certain parameters, thereby making this computation very difficult to implement in a SIMD (Single Instruction Multiple Data) architecture.
Traditional general purpose processors perform inverse quantization in DV decoding using a very straightforward but time consuming solution. For instance, in the prior art, the de-quantization coefficient (e.g., Qij) of each pixel element (e.g., Xij) is computed one-by-one, in a serial fashion, and then multiplied by its associated pixel value (e.g., Xij) and the result is stored in the DCT matrix 30. This is done serially for each of the 64 coefficients (X00-X77). That means, for each pixel (e.g., Xij) of the 8×8 block 10, at least one load instruction, one store instruction and one multiply (or shift) instruction are needed. This does not even include the time required to create the quantization coefficients (Qij) for each pixel (Xij), which are obtained from macroblock and block parameters. Therefore, using the conventional approach described above, it takes the general purpose processor more than 200 instructions to completely process one 8×8 data block 10 through inverse quantization to create the DCT matrix 30.
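The serial approach described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `dequantize_serial` and the sample contents of `x` and `q` are hypothetical and are not taken from the DV standard. The sketch tallies one load, one multiply and one store per coefficient, which gives 192 operations before any work spent deriving the quantization coefficients themselves.

```python
def dequantize_serial(x, q):
    """Element-by-element inverse quantization of an 8x8 block.

    x and q are 8x8 lists of ints. Returns (dct, ops), where ops counts
    one load, one multiply and one store for every coefficient.
    """
    dct = [[0] * 8 for _ in range(8)]
    ops = 0
    for i in range(8):
        for j in range(8):
            value = x[i][j]            # load
            value = value * q[i][j]    # multiply (or shift, if q is a power of two)
            dct[i][j] = value          # store
            ops += 3
    return dct, ops


# Hypothetical sample data, for illustration only (not real DV values).
x = [[i * 8 + j for j in range(8)] for i in range(8)]
q = [[1 << ((i + j) // 4) for j in range(8)] for i in range(8)]

dct, ops = dequantize_serial(x, q)
print(ops)  # 64 coefficients * 3 operations = 192
```

The count excludes the cost of computing each Qij, which is why the total for the conventional approach exceeds 200 instructions per block.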
Considering that DV decoding should be done in real-time to avoid image jitter and other forms of visual and/or audio artifacts with respect to the AV material, what is desired is a more efficient mechanism and method for performing inverse quantization to produce a DCT matrix 30 within a DV decoder.
SUMMARY OF THE INVENTION
Accordingly, the present invention provides a more efficient mechanism and method for performing inverse quantization within a DV decoder to produce a DCT matrix. The present invention performs up to eight multiply instructions in parallel for multiplying eight pixels (X) against eight quantization coefficients (Q) to simultaneously produce eight DCT coefficients using, in one embodiment, a 64-bit SIMD type media instruction set (and architecture) and a special quantization matrix. In another embodiment, a 128-bit SIMD type media instruction set (and architecture) can be used.
An efficient digital video (DV) decoder process is described herein that utilizes a specially constructed quantization matrix allowing an inverse quantization subprocess to perform parallel computations, e.g., using SIMD (Single Instruction Multiple Data) processing, to efficiently produce a matrix of DCT (Discrete Cosine Transform) coefficients. The present invention can take advantage of the SIMD architecture because it generates a vector containing the desired values, which can then be processed in parallel. In the inverse quantization process of DV decoding, obtaining the quantization scale vectors is complex. One embodiment of the present invention utilizes an array (vector) of 15 pre-defined quantization scales to dynamically build an 8×8 quantization matrix using one shift instruction for each row of the matrix after the first. Therefore, one load instruction and seven shift instructions are needed for obtaining an 8×8 quantization matrix for an 8×8 pixel block.
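A minimal Python sketch of how a 15-value array can yield all eight rows of the quantization matrix is given below. It assumes that each row is simply the 8-value window of the array starting one position further along than the previous row, so that entry (i, j) equals element i + j of the array. The function name `build_quant_matrix` and the scale values are hypothetical placeholders, not values from the DV standard.

```python
def build_quant_matrix(scales):
    """Build an 8x8 matrix whose row i is scales[i:i+8].

    Row 0 corresponds to the initial load of the array; each of the
    following seven rows corresponds to shifting the array by one value.
    """
    if len(scales) != 15:
        raise ValueError("expected exactly 15 quantization scales")
    return [scales[row:row + 8] for row in range(8)]


# Hypothetical scale values, for illustration only.
scales = [1, 1, 1, 2, 2, 2, 4, 4, 4, 8, 8, 8, 16, 16, 16]
q = build_quant_matrix(scales)

for row in q:
    print(row)
```

With this layout, every anti-diagonal of the matrix holds a single scale value, which is why 15 values (8 + 8 - 1) are enough to describe all 64 entries.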
The present invention utilizes a first look-up table (for 8×8 DCT mode) which produces a 15-valued array based on class number information, area number information and a quantization (QNO) number for an 8×8 data block (“data matrix” or “pixel block”) from the header information decoded from the encoded digital bitstream. The 8×8 data block is produced from a variable length decoding and inverse scan subprocess. An individual 8-valued segment of the 15-value array is multiplied by an individual 8-valued segment, e.g., “a row,” of the 8×8 data matrix to produce an individual row of the 8×8 matrix of DCT coefficients (“DCT matrix”). The above eight multiplications can be performed in parallel using a SIMD architecture to simultaneously generate the row of eight DCT coefficients. In this way, eight passes through the 8×8 data block are used to produce the entire 8×8 DCT matrix; in one embodiment this consumes only 33 instructions per 8×8 data block. After each pass, the 15-valued array is shifted by one value to update its quantization coefficients for proper alignment with its associated row of the data block. This continues until all rows of the data block are processed. The DCT matrix is then processed by an inverse discrete cosine transformation subprocess that generates decoded display data. A second lookup table can be used for 2×4×8 DCT mode processing.
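The row-by-row procedure above might be sketched as follows. This is an illustration only: the table `SCALE_LUT`, its keys and its values are invented for the example, and the sketch assumes that the area-dependent scaling is already encoded by position within the 15-value array, so the table is keyed only by a class number and a QNO. The list comprehension stands in for a single SIMD multiply over eight lanes.

```python
# Hypothetical look-up table: (class_no, qno) -> 15 quantization scales.
# Keys and values are placeholders, not taken from the DV standard.
SCALE_LUT = {
    (0, 0): [1] * 15,
    (1, 5): [1, 1, 1, 2, 2, 2, 4, 4, 4, 8, 8, 8, 16, 16, 16],
}


def dequantize_block(block, class_no, qno):
    """Inverse-quantize an 8x8 block one row per pass."""
    window = list(SCALE_LUT[(class_no, qno)])  # one table load
    dct = []
    for row in block:
        # Eight lane-wise products, modelling one SIMD multiply.
        dct.append([x * s for x, s in zip(row, window[:8])])
        # Shift the array by one value to align it with the next row.
        window = window[1:]
    return dct


block = [[1] * 8 for _ in range(8)]  # all-ones block exposes the scales
dct = dequantize_block(block, class_no=1, qno=5)
print(dct[0])
print(dct[7])
```

Feeding in an all-ones block makes the output equal to the implied quantization matrix, which is a convenient way to check the alignment of the shifted array against each row.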
One embodiment of the present invention is applied to a software DV decoder on a microprocessor with 128-bit registers and a multi-media instruction set. This instruction set includes an instruction that multiplies eight 16-bit values from one register with eight 16-bit values from another register to simultaneously produce eight results, and an instruction that shifts two concatenated registers (256 bits in total) by a given number of bytes. By using these media instructions and the 128-bit wide bandwidth, not only are the execution cycles reduced by the present invention, but the memory access latency for the quantization matrix is also reduced to one access. In this implementation, 33 instructions are used to de-quantize one 8×8 block for both 8×8 DCT mode and for 2×4×8 DCT mode.
In an alternate embodiment of the present invention, a 64-bit SIMD architecture can also be used. With 64-bit SIMD instructions, two multiplication instructions are applied for each row of the 8×8 matrix. Therefore, the cycles spent on multiplication are doubled compared to the 128-bit SIMD embodiment. However, the generation of the quantization matrix is analogous to the 128-bit SIMD embodiment.
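The difference between the two register widths can be worked through with a short Python sketch. It assumes 16-bit values in both embodiments, so a 128-bit register holds eight lanes and a 64-bit register holds four. The function names are hypothetical, and the per-lane loop only models what a single SIMD multiply would do.

```python
def multiplies_per_block(register_bits, value_bits=16, row_len=8, rows=8):
    """Number of SIMD multiply instructions needed for one 8x8 block."""
    lanes = register_bits // value_bits
    per_row = -(-row_len // lanes)  # ceiling division
    return per_row * rows


def multiply_row(row, scales, lanes):
    """Multiply one row in chunks of `lanes` values; returns (result, count)."""
    out, count = [], 0
    for start in range(0, len(row), lanes):
        chunk = zip(row[start:start + lanes], scales[start:start + lanes])
        out.extend(x * s for x, s in chunk)  # models one SIMD multiply
        count += 1
    return out, count


print(multiplies_per_block(128))  # 8 multiplies: one per row
print(multiplies_per_block(64))   # 16 multiplies: two per row

row = [1, 2, 3, 4, 5, 6, 7, 8]
scales = [1, 1, 2, 2, 4, 4, 8, 8]
wide, wide_count = multiply_row(row, scales, lanes=8)
narrow, narrow_count = multiply_row(row, scales, lanes=4)
```

Both widths produce the same products; only the number of multiply instructions changes, which matches the doubling described above.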
More specifically, embodiments of the present invention include, in a digital DV decoding process, a meth
