Reexamination Certificate
1999-10-12
2003-04-15
Do, Anh Hong (Department: 2624)
Image analysis
Image compression or coding
Interframe coding
C382S232000, C375S240160
active
06549668
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to transmission of digitized pictures, and, more particularly, to a picture compression technique based on motion estimation algorithms, such as those implemented in MPEG2 video encoders.
BACKGROUND OF THE INVENTION
World TV standards differ from one another in picture size and the way pictures are transmitted, e.g., frame rate, frequency band, etc. Usually, a motion picture is recorded by a camera at 24 photograms/second. In order for the film to be transmitted with the PAL or SECAM system, which requires a speed of 25 pictures/second, the film is slightly accelerated. This results in minor sound and motion distortions which are not perceived by a TV viewer. However, when the film must be transmitted with the NTSC TV standard, which requires a frame rate of 30 pictures/second, such an acceleration might produce distortions that would be perceived by the TV viewer. In this case, it is necessary to implement a 3:2 pulldown technique which transforms a film recorded at 24 photograms/second to a TV sequence of 30 pictures/second.
As shown in FIG. 1, a motion picture is recorded by a common photogram, that is, each picture is acquired as a whole in one instant. In contrast, a television picture is acquired in two distinct instants. The first instant is the acquisition of the even lines, which make up the even semifield (top field) of the picture. The second instant is the acquisition of the odd lines, which make up the odd semifield (bottom field). Together, these two semifields make up the whole picture, which is also referred to as a video frame. Because some time passes between the acquisition of the first and second semifields, there may be relative movements among the objects focused by the video camera. Consequently, an object may occupy slightly different positions in the two fields.
Even a photogram may be divided into two fields: the even lines form the top field and the odd lines form the bottom field. Since the picture is acquired in a single instant, the focused objects occupy the same position in both fields. The 3:2 pulldown conversion method from 24 photograms/second to 30 pictures/second transforms a sequence of 4 photograms of the film into a sequence of 5 TV frames through the duplication of some fields, according to the scheme illustrated in FIG. 2. In FIG. 2, Top i and Bot i indicate the top field (even lines) and the bottom field (odd lines), respectively, of photogram i, where i=1,2,3,4. This repetition of fields causes artifacts that do not disturb the TV viewer because the frames repeat themselves at 33 ms intervals.
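The field duplication can be illustrated with a minimal Python sketch. The function name `pulldown_3_2` and the particular assignment of repeated fields are assumptions: the text does not spell out the exact arrangement of FIG. 2, so the code uses one widely used 3:2 cadence (fields in the order Top 1, Bot 1, Top 1, Bot 2, Top 2, Bot 3, Top 3, Bot 3, Top 4, Bot 4), which may differ from the figure.

```python
def pulldown_3_2(photograms):
    """Expand each group of 4 photograms into 5 interlaced frames.

    Each output frame is a (top_field, bottom_field) pair of labels.
    The cadence below is an assumed, commonly used arrangement; the
    scheme of FIG. 2 may place the repeated fields differently.
    """
    if len(photograms) % 4 != 0:
        raise ValueError("expected a multiple of 4 photograms")
    frames = []
    for k in range(0, len(photograms), 4):
        p1, p2, p3, p4 = photograms[k:k + 4]
        frames += [
            (f"Top {p1}", f"Bot {p1}"),
            (f"Top {p1}", f"Bot {p2}"),  # Top field of photogram 1 repeated
            (f"Top {p2}", f"Bot {p3}"),
            (f"Top {p3}", f"Bot {p3}"),  # Bot field of photogram 3 repeated
            (f"Top {p4}", f"Bot {p4}"),
        ]
    return frames


frames = pulldown_3_2([1, 2, 3, 4])
for top, bot in frames:
    print(top, "|", bot)
```

In this arrangement photograms 1 and 3 each contribute three fields and photograms 2 and 4 contribute two, so 4 photograms yield 10 fields, that is, 5 frames.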
Since the fundamental aim of picture compression methods is to reduce the amount of information that must be transmitted or recorded, the encoding of repeated fields can be avoided once a 3:2 pulldown transformation has been detected in the sequence: the 5 pictures are then coded with the same number of bits that 4 pictures would require. Hence, it becomes a function of the decoder (the receiving station) to reconstruct the sequence of 5 pictures according to the above described scheme. For the same picture quality the compression ratio may be increased, or, for the same number of bits available for coding the pictures, their quality may be improved.
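The decoder-side reconstruction can be sketched as follows. This is an illustration under assumptions, not the method of the invention: the two per-picture flags are modelled on the `top_field_first` and `repeat_first_field` flags of the MPEG-2 picture coding extension, which the text above does not mention, and the `CodedPicture` class and `expand_fields` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CodedPicture:
    index: int                # photogram number
    top_field_first: bool     # True: the top field is displayed first
    repeat_first_field: bool  # True: the first field is displayed again


def expand_fields(pictures):
    """Rebuild the displayed field sequence from the coded pictures."""
    fields = []
    for pic in pictures:
        first, second = ("Top", "Bot") if pic.top_field_first else ("Bot", "Top")
        fields.append(f"{first} {pic.index}")
        fields.append(f"{second} {pic.index}")
        if pic.repeat_first_field:
            fields.append(f"{first} {pic.index}")
    return fields


# Only 4 pictures are coded; the flags carry the pulldown cadence (assumed values).
coded = [
    CodedPicture(1, top_field_first=True, repeat_first_field=True),
    CodedPicture(2, top_field_first=False, repeat_first_field=False),
    CodedPicture(3, top_field_first=False, repeat_first_field=True),
    CodedPicture(4, top_field_first=True, repeat_first_field=False),
]
fields = expand_fields(coded)
# Pair consecutive fields into the 5 displayed frames.
frames = [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]
```

The point of the sketch is that the repeated fields cost only a flag per picture instead of a full coded field.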
A reliable detection of the 3:2 pulldown may also permit a further improvement of the coding quality for compression algorithms that use motion estimators based on the correlation of generated motion fields. The field repetition according to a 3:2 pulldown scheme may cause inconsistencies in the global motion of the pictures. When a field is repeated, the motion estimator senses this as a stop of the motion (even if fictitious) of the picture's objects, thus disrupting the convergence process of the generated vectors.
A further beneficial consequence of the detection of the 3:2 pulldown is that its detection will render useless all the predictions permitted by the encoding method, e.g., field prediction, dual prime prediction and frame prediction, etc. Detection of the 3:2 pulldown is also an advantage for the methods that take into consideration the lack of movement among the picture fields, such as the frame prediction methods.
To detect the 3:2 pulldown, an analysis of the motion fields generated by the coding system for an acceptable motion estimation among the fields becomes essential. The basic concept of motion estimation is the following. A set of pixels of a field of a picture may be placed in a position of the subsequent picture obtained by translating the preceding one. These transpositions of objects may expose to the video camera parts that were not visible before, as well as changes of their shape, e.g., zooming.
The family of algorithms suitable to identify and associate these portions of images is generally referred to as motion estimation. Such an association permits calculation of a difference image. This removes the redundant temporal information, which makes the subsequent process of compression by discrete cosine transform (DCT), quantization and entropy coding more effective.
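One simple member of this family is exhaustive block matching, sketched below in Python. It is offered only as an illustration of the association and of the difference image, not as the estimator of the invention. The function names, the sum of absolute differences (SAD) cost, the block size and the search range are all assumptions. The vector (dx, dy) points from the block in the current picture to its best match in the reference picture.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at
    (bx, by) and the block of ref displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n) for x in range(n)
    )


def best_vector(cur, ref, bx, by, n, search):
    """Try every displacement within +/- search pixels, keep the cheapest."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= h
                      and 0 <= bx + dx and bx + dx + n <= w)
            if not inside:
                continue  # candidate block would fall outside the picture
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def residual(cur, ref, bx, by, dx, dy, n):
    """Difference image for one block after motion compensation."""
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]


def frame_with_patch(px, py, size=8):
    """Black test picture with a bright 2 x 2 patch at (px, py)."""
    img = [[0] * size for _ in range(size)]
    for y in range(py, py + 2):
        for x in range(px, px + 2):
            img[y][x] = 200
    return img


ref = frame_with_patch(2, 3)   # reference picture
cur = frame_with_patch(3, 3)   # patch has moved one pixel to the right
cost, dx, dy = best_vector(cur, ref, bx=2, by=2, n=4, search=2)
res = residual(cur, ref, 2, 2, dx, dy, 4)
```

When the match is exact, as in this toy case, the residual block is all zeros, which is what makes the later transform and quantization stages cheap.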
Such a method is found in the MPEG2 standard. Systems of motion estimation as well as the architecture of the present invention are equally useful and are readily applicable to systems for manipulating digitized pictures operating according to a standard different from the MPEG2 standard. A typical block diagram of a video MPEG2 encoder is depicted in FIG. 3. Such a system is made up of the following functional blocks.
Field ordinator. This block is composed of one or several field memories outputting the fields in the coding order required by the MPEG standard. For example, if the input sequence is I B B P B B P etc., the output order will be I P B B P B B . . . .
The intra coded picture I is a field and/or a semifield containing temporal redundancy. The predicted picture P is a field and/or semifield from which the temporal redundancy with respect to the preceding I or P (previously co/decoded) picture has been removed. The bidirectionally predicted picture B is a field and/or a semifield whose temporal redundancy with respect to the preceding I and subsequent P (or preceding P and subsequent P) picture has been removed. In both cases, the I and P pictures must be considered as already co/decoded.
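The reordering performed by the field ordinator can be sketched in a few lines of Python. The function name `coding_order` is hypothetical, and the rule used (hold each B picture until the next I or P picture has been emitted) is inferred from the example order given above.

```python
def coding_order(display_order):
    """Reorder pictures so every B picture follows both of its references.

    Pictures are strings whose first character is the type: I, P or B.
    """
    out, pending_b = [], []
    for pic in display_order:
        if pic[0] == "B":
            pending_b.append(pic)      # wait for the next reference picture
        else:
            out.append(pic)            # I or P goes out first ...
            out.extend(pending_b)      # ... then the B pictures it closes
            pending_b = []
    out.extend(pending_b)              # sketch only: trailing B pictures flushed as-is
    return out


print(" ".join(coding_order(["I", "B", "B", "P", "B", "B", "P"])))
```

The B pictures are delayed because, as stated above, the I and P pictures they refer to must already be co/decoded.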
Each frame buffer in the format 4:2:0 occupies the following memory space:
Standard PAL
720 × 576 × 8 for the luminance (Y) = 3,317,760 bits
360 × 288 × 8 for the chrominance (U) = 829,440 bits
360 × 288 × 8 for the chrominance (V) = 829,440 bits
Total Y + U + V = 4,976,640 bits
Standard NTSC
720 × 480 × 8 for the luminance (Y) = 2,764,800 bits
360 × 240 × 8 for the chrominance (U) = 691,200 bits
360 × 240 × 8 for the chrominance (V) = 691,200 bits
Total Y + U + V = 4,147,200 bits
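The totals above follow directly from the 4:2:0 format, in which each chrominance plane has half the width and half the height of the luminance plane. A short Python check (the function name `frame_buffer_bits` is hypothetical):

```python
def frame_buffer_bits(width, height, bits_per_sample=8):
    """Bits needed for one 4:2:0 frame: full-size Y plus quarter-size U and V."""
    luma = width * height * bits_per_sample
    chroma = (width // 2) * (height // 2) * bits_per_sample  # one chroma plane
    return luma + 2 * chroma


pal_bits = frame_buffer_bits(720, 576)
ntsc_bits = frame_buffer_bits(720, 480)
print(f"PAL:  {pal_bits:,} bits")
print(f"NTSC: {ntsc_bits:,} bits")
```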
Motion Estimator. This block removes the temporal redundancy from the P and B pictures.
DCT. This block implements the discrete cosine transform according to the MPEG-2 standard. The I picture and the error pictures P and B are divided into 8×8 blocks of pixels Y, U, V on which the DCT transform is performed.
Quantizer Q. An 8×8 block resulting from the DCT transform is divided by a quantizing matrix to reduce the magnitude of the DCT coefficients. The information associated with the highest frequencies, less visible to human sight, tends to be removed. The result is reordered and sent to the successive block.
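The DCT and quantizer stages can be sketched in plain Python as follows. This is a simplified illustration under assumptions: the matrix `QMATRIX` is a made-up example whose step size grows with frequency, not the default matrix of the standard, and the additional scaling that a real MPEG-2 quantizer applies is omitted.

```python
import math

N = 8


def dct_8x8(block):
    """Direct (slow) two-dimensional DCT-II of an 8 x 8 block."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def quantize(coeffs, qmatrix):
    """Divide each coefficient by its quantizer step and round."""
    return [[round(coeffs[u][v] / qmatrix[u][v]) for v in range(N)]
            for u in range(N)]


# Hypothetical quantizing matrix: coarser steps at higher frequencies.
QMATRIX = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

flat = [[128] * N for _ in range(N)]   # uniform grey block
levels = quantize(dct_8x8(flat), QMATRIX)
```

For a uniform block all the energy lands in the DC coefficient, so after quantization only `levels[0][0]` is non-null. Blocks with little detail produce long runs of null coefficients in the same way, which the next stage exploits.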
Variable Length Coding (VLC). The code words output from the quantizer tend to contain a large number of null coefficients, followed by non-null values. The null values preceding the first non-null value are counted, and the count forms the first portion of a code word, the second portion of which represents the non-null coefficient.
These paired values tend to assume values mor
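The pairing of a count of null values with the following non-null coefficient, as described for the VLC block, can be sketched in Python. The function name `run_level_pairs` is hypothetical, and the sketch stops at forming the pairs; the assignment of variable-length codes to them is not shown.

```python
def run_level_pairs(coeffs):
    """Turn a reordered coefficient list into (zero_run, value) pairs."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1                 # count null values before the next non-null one
        else:
            pairs.append((run, c))
            run = 0
    # Trailing null values produce no pair in this sketch; a practical
    # coder would signal the end of the block instead.
    return pairs


pairs = run_level_pairs([12, 0, 0, -3, 1, 0, 0, 0, 2, 0, 0])
print(pairs)
```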
Pau Danilo
Pezzoni Luca
Piccinelli Emiliano
Allen Dyer Doppelt Milbrath & Gilchrist, P.A.
Jorgenson Lisa K.
STMicroelectronics S.r.l.