Image analysis – Applications – Motion or velocity measuring
Reexamination Certificate
2001-03-16
2004-12-07
Mehta, Bhavesh M. (Department: 2625)
C382S236000, C348S699000, C375S240010
active
06829373
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to methods of encoding video sequences and in particular to a method of optimally setting the dimensions of the search window for a most efficient coding.
BACKGROUND OF THE INVENTION
The invention is useful in digital video coders where it is necessary to evaluate the activity of a block of information in the frequency domain.
Because of the particular importance of the widely applied MPEG standard in treating digitized video sequences, a description of the method implemented within an MPEG2 coding will be presented to illustrate a practical implementation of the method of the invention. Obviously, the method of the invention remains perfectly valid and advantageously applicable even in decoders based on different standards (other than the MPEG), as they are defined from time to time.
Description of the MPEG2 Video Coding
The MPEG (Moving Picture Experts Group) standard defines a set of algorithms dedicated to the compression of sequences of digitized pictures. These techniques are based on the reduction of the temporal, spatial and statistical redundance of the information constituting the sequence.
Reduction of spatial and statistical redundance is achieved by compressing independently the single images, by means of discrete cosine transform (DCT), quantization and variable length Huffman coding.
The reduction of temporal redundance is obtained using the correlation that exists between successive pictures of a sequence. Approximately, it may be said that each image can be expressed, locally, as a translation of a previous and/or subsequent image of the sequence. To this end, the MPEG standard uses three kinds of pictures, indicated with I (Intra Coded Frame), P (Predicted Frame) and B (Bidirectionally Predicted Frame). The I pictures are coded in a fully independent mode; the P pictures are coded with respect to a preceding I or P picture in the sequence; the B pictures are coded with respect to two pictures, of I or P kind: the preceding one and the following one in the video sequence.
A typical sequence of pictures can be the following one: I B B P B B P B B I B . . . This is the order in which they will be viewed, but given that any P is coded in respect to the previous I or P, and any B in respect to the preceding and following I or P, it is necessary that the decoder receive the P pictures before the B pictures, and the I pictures before the P pictures. Therefore the order of transmission of the pictures will be I P B B P B B I B B . . .
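The reordering described above can be sketched in a few lines of Python. This is a minimal illustration, not the coder's actual implementation: the function name and the handling of trailing B pictures are assumptions made for the example.

```python
def display_to_transmission_order(display_order):
    """Reorder picture types from display order to transmission order.

    Each B picture is held back until the next I or P picture (its
    following reference) has been emitted.
    """
    transmitted = []
    pending_b = []  # B pictures waiting for their following reference
    for picture in display_order:
        if picture == "B":
            pending_b.append(picture)
        else:  # I or P: emit the reference first, then the held B pictures
            transmitted.append(picture)
            transmitted.extend(pending_b)
            pending_b = []
    # Assumption: trailing B pictures with no later reference are flushed as-is.
    transmitted.extend(pending_b)
    return transmitted


print(" ".join(display_to_transmission_order("I B B P B B P B B I".split())))
# I P B B P B B I B B
```

For the display sequence I B B P B B P B B I, the sketch yields the transmission order I P B B P B B I B B quoted in the text.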
Pictures are processed by the coder sequentially, in the indicated order, and subsequently sent to a decoder which decodes and reorders them, allowing their subsequent displaying. To code a B picture it is necessary for the coder to keep in a dedicated memory buffer, called “frame memory”, the I and P pictures, coded and thereafter decoded, to which a current B picture refers, thus requiring an appropriate memory capacity.
One of the most important functions in coding is motion estimation. Motion estimation is based on the following consideration: a set of pixels of a picture frame, called the current pixel set, may be placed in a position of the subsequent and/or preceding picture obtained by rigid translation of the position corresponding to the current pixel set. Of course, these transpositions of objects may expose parts that were not visible before as well as changes of their shape (e.g., during zooming, rotation and the like).
The family of algorithms suitable to identify and associate these portions of pictures is generally referred to as “motion estimation”. Such association of pixels is instrumental to calculate the relative coordinates between the current portion and the portion identified as the best predictor, and to calculate the portion of picture difference, so removing redundant temporal information, thus making more effective the subsequent processes of DCT compression, quantization and entropy coding.
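As an illustration of how a best predictor and its relative coordinates may be found, the following Python sketch performs an exhaustive block-matching search. It rests on assumptions not taken from the text: the matching cost is the sum of absolute differences (SAD), every displacement inside a square window of ±search_range pixels is tested, and all function and parameter names are hypothetical.

```python
def sad(current, reference, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(current[cy + dy][cx + dx] - reference[ry + dy][rx + dx])
    return total


def full_search(current, reference, cx, cy, size, search_range):
    """Return (motion_vector, cost) of the best predictor for the block at (cx, cy).

    Every displacement within +/- search_range pixels is tested, skipping
    candidates that fall outside the reference picture.
    """
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = (0, 0), None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = cx + mx, cy + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(current, reference, cx, cy, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (mx, my), cost
    return best_vector, best_cost
```

The number of candidates grows with the square of search_range, which is why the dimensions of the search window have a direct impact on the cost of the coding.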
Such a method finds a typical example in the MPEG-2 standard. A typical block diagram of a video MPEG-2 coder is depicted in FIG. 1. Such a system includes the following functional blocks:
1) Chroma Filter Block from 4:2:2 to 4:2:0
In this block there is a low pass finite time response filter operating on the chrominance component, which allows the substitution of any pixel with the weighted sum of neighboring pixels placed on the same column and multiplied by appropriate coefficients. This allows a subsequent subsampling by two, thus obtaining a halved vertical definition of the chrominance.
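A minimal Python sketch of such a vertical filter followed by subsampling by two is given below. The coefficients (1, 2, 1), the replication of edge rows and the rounding are assumptions chosen for the example; the text does not specify them.

```python
def chroma_422_to_420(chroma, taps=(1, 2, 1)):
    """Vertically low-pass filter a chroma plane, then keep every other row.

    chroma is a list of rows; taps are hypothetical filter coefficients,
    normalized by their sum. Rows beyond the edges are replicated.
    """
    height = len(chroma)
    width = len(chroma[0])
    half = len(taps) // 2
    norm = sum(taps)
    output = []
    for y in range(0, height, 2):  # vertical subsampling by two
        row = []
        for x in range(width):
            acc = 0
            for k, tap in enumerate(taps):
                yy = min(max(y + k - half, 0), height - 1)  # clamp at edges
                acc += tap * chroma[yy][x]
            row.append((acc + norm // 2) // norm)  # rounded integer division
        output.append(row)
    return output
```

Each output pixel is a weighted sum of pixels in the same column, and the output plane has half as many rows as the input.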
2) Frame Ordinator
This block includes one or several frame memories outputting the frames in the coding order required by the MPEG standard. For example, if the input sequence is I B B P B B P etc., the output order will be I P B B P B B . . . .
I (Intra coded picture) is a frame or a half-frame containing temporal redundance;
P (Predicted-picture) is a frame or a half-frame whose temporal redundance in respect to the preceding I or P (previously co/decoded) has been removed;
B (Bidirectionally predicted-picture) is a frame or a half-frame whose temporal redundance with respect to the preceding I and subsequent P (or preceding P and subsequent P, or preceding P and subsequent I) has been removed (in both cases the I and P pictures must be considered as already co/decoded).
Each frame buffer in the format 4:2:0 occupies the following memory amount:
Standard PAL
720×576×8 for the luminance (Y) = 3,317,760 bits
360×288×8 for the chrominance (U) = 829,440 bits
360×288×8 for the chrominance (V) = 829,440 bits
total Y+U+V = 4,976,640 bits
Standard NTSC
720×480×8 for the luminance (Y) = 2,764,800 bits
360×240×8 for the chrominance (U) = 691,200 bits
360×240×8 for the chrominance (V) = 691,200 bits
total Y+U+V = 4,147,200 bits
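The figures above follow directly from the picture dimensions and can be checked with a short Python sketch. The function name is hypothetical; the 8 bits per sample and the halving of both chrominance dimensions in the 4:2:0 format are taken from the figures themselves.

```python
def frame_buffer_bits(width, height, bits_per_sample=8):
    """Return (Y, U, V, total) sizes in bits for one 4:2:0 frame."""
    luma = width * height * bits_per_sample
    # 4:2:0 halves the chrominance resolution horizontally and vertically.
    chroma = (width // 2) * (height // 2) * bits_per_sample
    return luma, chroma, chroma, luma + 2 * chroma


for name, (w, h) in {"PAL": (720, 576), "NTSC": (720, 480)}.items():
    print(name, frame_buffer_bits(w, h))
# PAL (3317760, 829440, 829440, 4976640)
# NTSC (2764800, 691200, 691200, 4147200)
```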
3) Estimator
This block is able to remove the temporal redundance from the P and B pictures.
4) DCT
This is the block that implements the discrete cosine transform according to the MPEG-2 standard. The I picture and the error pictures P and B are divided into blocks of 8*8 pixels Y, U, V, on which the DCT transform is performed.
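For reference, a direct (unoptimized) Python sketch of the two-dimensional 8*8 DCT is shown below. It uses the common DCT-II definition with the 1/4 scaling and 1/√2 weights on the zero-frequency terms; practical coders use fast factorizations, and the function name is an assumption.

```python
import math


def dct_8x8(block):
    """Return the 8x8 two-dimensional DCT-II of block (a list of 8 rows)."""
    n = 8

    def scale(k):
        # Weight applied to the zero-frequency (DC) basis function.
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * n for _ in range(n)]
    for v in range(n):
        for u in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[v][u] = 0.25 * scale(u) * scale(v) * total
    return coeffs
```

With this scaling, a uniform block of value 10 produces a single nonzero coefficient of 80 at position (0, 0), with all other coefficients equal to zero up to floating-point error.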
5) Quantizer Q
An 8*8 block resulting from the DCT transform is then divided by a so-called quantizing matrix (specifically, the cosine transformed matrix of the macroblock is divided by the matrix mQuant*Quantizer_Matrix, where Quantizer_Matrix is established a priori and can vary from picture to picture) to reduce more or less drastically the bit number magnitude of the DCT coefficients. In such case, the information associated to the highest frequencies, less visible to human sight, tends to be removed. The result is reordered and sent to the subsequent block.
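The element-wise division by mQuant*Quantizer_Matrix can be sketched in Python as follows. This is a simplified illustration of the description above only: truncation toward zero is an assumption, and the exact rounding and scaling rules of the MPEG-2 standard are not reproduced. The function and parameter names are hypothetical.

```python
def quantize(dct_block, quantizer_matrix, mquant):
    """Divide each DCT coefficient by mquant * the matching matrix entry.

    Assumption: results are truncated toward zero; the standard's exact
    rounding and scaling rules are not reproduced here.
    """
    return [
        [int(coeff / (mquant * q)) for coeff, q in zip(coeff_row, q_row)]
        for coeff_row, q_row in zip(dct_block, quantizer_matrix)
    ]
```

Larger matrix entries at the high-frequency positions, or a larger mquant, drive more coefficients to zero, which is how the high-frequency information tends to be removed.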
6) Variable Length Coding (VLC)
The codification words output from the quantizer tend to contain null coefficients in a more or less large number, followed by nonnull values. The null values preceding the first nonnull value are counted and the count figure constitutes the first portion of a codification word, the second portion of which represents the nonnull coefficient.
These pairs tend to assume values more probable than others. The most probable ones are coded with relatively short words (composed of 2, 3 or 4 bits) while the least probable are coded with longer words. Statistically, the number of output bits is less than when such a criterion is not implemented.
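The pairing of a zero count with the nonnull coefficient that follows it can be sketched in Python as below. The function name is hypothetical, and dropping the trailing zeros (on the assumption that an end-of-block code covers them) is a choice made for the example; the assignment of short or long code words to the pairs is not shown.

```python
def run_level_pairs(coefficients):
    """Convert a scanned coefficient list into (run, level) pairs.

    run is the number of zeros preceding each nonzero level. Trailing
    zeros produce no pair (assumption: an end-of-block code covers them).
    """
    pairs = []
    run = 0
    for level in coefficients:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


print(run_level_pairs([5, 0, 0, 3, 0, -1, 0, 0]))
# [(0, 5), (2, 3), (1, -1)]
```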
7) Multiplexer and Buffer
Data generated by the variable length coder for each macroblock, the motion vectors, the kind of macroblock I/P/B, the mQuant values, the quantizing matrices of each picture and other syntactic elements are assembled for constructing the serial bitstream whose final syntax is fully defined by the MPEG-2 video section standard. The resulting bitstream is stored in a memory buffer, the limit size of which is defined by the MPEG-2 standard requisite that the buffer cannot be overflown, otherwise a loss of information useful in decoding would o
Pau Danilo
Piccinelli Emiliano
Rovati Fabrizio
Allen Dyer Doppelt Milbrath & Gilchrist, P.A.
Azarian Seyed
Jorgenson Lisa K.
Mehta Bhavesh M.
STMicroelectronics S.r.l.