Predictive encoding and decoding methods of video data

Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal

Reexamination Certificate


Details

US Classes: C375S240000, C375S240160, C375S240170, C375S240130, C375S240150

Type: Reexamination Certificate

Status: active

Patent number: 06785331

ABSTRACT:

TECHNICAL FIELD
The present invention relates to methods for encoding and decoding signals of video data (i.e., moving pictures).
BACKGROUND ART
In existing video data coding standards such as ITU-T H.261, H.263, ISO/IEC 11172-2 (MPEG-1), and ISO/IEC 13818-2 (MPEG-2), a motion-compensated interframe prediction method is adopted for reducing temporal redundancy in video data. A similar motion compensation method is also adopted in an example model based on the ISO/IEC 14496-2 (MPEG-4) standard, which is currently being studied.
Generally in motion-compensated predictive coding methods, (i) a frame to be encoded (i.e., the current frame) is divided into rectangular blocks, called “macroblocks”, having 16 pixels×16 lines, (ii) a relative amount of the motion (i.e., a motion vector having horizontal component t_x and vertical component t_y of displacement) with respect to a reference frame is detected for each macroblock, and (iii) an interframe difference between a predicted frame and the current frame is encoded, where the predicted frame is obtained in a manner such that the block of the reference frame corresponding to the relevant macroblock of the current frame is shifted by the motion vector.
More specifically, the predicted image data (in the reference frame) which best matches the image data at point (x, y) of the current frame is represented by using coordinates (x′, y′) and the above motion vector (t_x, t_y) as follows.

x′ = x + t_x
y′ = y + t_y

That is, the pixel value at the same point (x, y) of the reference frame is not used directly; instead, the pixel value at the point obtained by shifting (x, y) by the motion vector (t_x, t_y) is determined as the predicted value, thereby remarkably improving the efficiency of the interframe prediction.
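As a concrete illustration (not part of the patent), the following NumPy sketch forms the motion-compensated prediction for one macroblock and the interframe difference that would be encoded. The function names, the int16 residual type, and the assumption that the shifted block stays inside the reference frame are ours:

```python
import numpy as np

def predict_macroblock(reference, x, y, t_x, t_y, size=16):
    # Predicted block: the reference-frame block at (x + t_x, y + t_y),
    # i.e., the block shifted by the motion vector, per x' = x + t_x,
    # y' = y + t_y above. Assumes the shifted block lies inside the frame.
    return reference[y + t_y : y + t_y + size, x + t_x : x + t_x + size]

def macroblock_residual(current, reference, x, y, t_x, t_y, size=16):
    # Interframe difference actually encoded: the current macroblock minus
    # its motion-compensated prediction (int16 so negative values survive).
    block = current[y : y + size, x : x + size].astype(np.int16)
    pred = predict_macroblock(reference, x, y, t_x, t_y, size).astype(np.int16)
    return block - pred
```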
On the other hand, a global motion compensation method has been proposed, in which motions of the whole picture caused by a camera motion such as panning, tilting, or zooming are predicted (refer to H. Jozawa, et al., “Core Experiment on Global Motion Compensation (P1) Version 5.0”, Description of Core Experiments on Efficient Coding in MPEG-4 Video, pp. 1-17, December 1996). Below, the general structure and operation flow of the encoder and decoder used for global motion compensation will be explained with reference to FIGS. 3 and 4.
First, frame (data) 1 to be encoded (i.e., input frame 1) and reference frame (data) 3 are input into global motion estimator 4, where global motion parameters 5 relating to the whole frame are determined. Projective transformations, bilinear transformations, or affine transformations can be used as the motion model in this system. The method disclosed by Jozawa et al. can be applied to any motion model, so the kind of motion model is unlimited; the general forms of the representative motion models mentioned above are explained below.
With any point (x, y) of the current frame and corresponding predicted point (x′, y′) of the reference frame, the projective transformation is represented by the following formula.
x′ = (ax + by + t_x) / (px + qy + s)
y′ = (cx + dy + t_y) / (px + qy + s)  (1)
where a, b, c, d, p, q, and s are constants. The projective transformation is a basic form of the two-dimensional transformation, and generally, the case s=1 in formula (1) is called the projective transformation. If p=q=0 and s=1, then the formula represents the affine transformation.
The following is the formula representing the bilinear transformation.
x′ = gxy + ax + by + t_x
y′ = hxy + cx + dy + t_y  (2)
where a, b, c, d, g, and h are constants. If g = h = 0 in this formula, then the affine transformation is obtained, as shown in the following formula (3).
x′ = ax + by + t_x
y′ = cx + dy + t_y  (3)
In the above formulas, t_x and t_y respectively represent the amounts of parallel shifting motion in the horizontal and vertical directions. Parameter “a” represents an extension/contraction or inversion effect in the horizontal direction, while parameter “d” represents an extension/contraction or inversion effect in the vertical direction. Parameter “b” represents a shearing effect in the horizontal direction, while parameter “c” represents a shearing effect in the vertical direction. In addition, the condition that a = cos θ, b = sin θ, c = −sin θ, and d = cos θ represents rotation by angle θ. The condition that a = d = 1 and b = c = 0 represents a model equal to a conventional parallel motion model.
As explained above, the motion model employing the affine transformation can represent various motions such as parallel shift, extension/contraction, inversion, shear, and rotation, as well as any composite motion combining several of these. Projective or bilinear transformations, which have more parameters, can represent still more complicated motions.
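Purely as an illustration, the three motion models can be written as functions mapping a current-frame point (x, y) to its predicted point (x′, y′) in the reference frame; the Python function names are hypothetical:

```python
def projective(x, y, a, b, c, d, p, q, s, t_x, t_y):
    # Formula (1): both coordinates share the denominator px + qy + s.
    w = p * x + q * y + s
    return (a * x + b * y + t_x) / w, (c * x + d * y + t_y) / w

def bilinear(x, y, a, b, c, d, g, h, t_x, t_y):
    # Formula (2): adds the cross terms gxy and hxy to the affine form.
    return g * x * y + a * x + b * y + t_x, h * x * y + c * x + d * y + t_y

def affine(x, y, a, b, c, d, t_x, t_y):
    # Formula (3): projective with p = q = 0 and s = 1,
    # or bilinear with g = h = 0.
    return a * x + b * y + t_x, c * x + d * y + t_y
```

For instance, affine(x, y, 1, 0, 0, 1, t_x, t_y) reduces to the conventional parallel motion model noted above.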
The global motion parameters 5 determined in the global motion estimator 4 are input into global motion compensated predictor 6 together with reference frame 3 stored in frame memory 2. The global motion compensated predictor 6 makes the motion vector (for each pixel) calculated using the global motion parameters 5 act on the reference frame 3, so as to generate global motion-compensating predicted frame (data) 7.
On the other hand, the reference frame 3 stored in the frame memory 2 is input into local motion estimator 8 together with input frame 1. In the local motion estimator 8, motion vector 9 between the input frame 1 and the reference frame 3 is detected for each macroblock of 16 pixels×16 lines. In the local motion compensated predictor 10, local motion-compensating predicted frame (data) 11 is generated using the motion vector 9 of each macroblock and the reference frame 3. The above operation corresponds to the conventional motion compensation method used in MPEG or the like.
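For illustration, here is a full-search block-matching sketch of the local motion estimator. The sum-of-absolute-differences (SAD) criterion and the ±7-pixel search range are common practice but are our assumptions here:

```python
import numpy as np

def estimate_motion_vector(current, reference, bx, by, size=16, search=7):
    # Full-search block matching: try every (t_x, t_y) in the search window
    # and keep the vector with the minimum sum of absolute differences.
    block = current[by : by + size, bx : bx + size].astype(np.int32)
    best_vec, best_sad = (0, 0), None
    for t_y in range(-search, search + 1):
        for t_x in range(-search, search + 1):
            x, y = bx + t_x, by + t_y
            if x < 0 or y < 0 or x + size > reference.shape[1] or y + size > reference.shape[0]:
                continue  # candidate block falls outside the reference frame
            cand = reference[y : y + size, x : x + size].astype(np.int32)
            sad = int(np.abs(block - cand).sum())
            if best_sad is None or sad < best_sad:
                best_vec, best_sad = (t_x, t_y), sad
    return best_vec  # motion vector 9 for this macroblock
```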
Next, the prediction mode determining section 12 chooses, for each macroblock, whichever of the global motion-compensating predicted frame 7 and the local motion-compensating predicted frame 11 has the smaller error with respect to the input frame 1. The predicted frame 13 chosen by the prediction mode determining section 12 is input into subtracter 14, and a difference frame 15 between the input frame 1 and the predicted frame 13 is converted into DCT coefficients 17 in DCT (discrete cosine transform) section 16. Each DCT coefficient 17 obtained by the DCT section 16 is further converted into quantized index 19 in quantizer 18. The quantized index 19, global motion parameters 5, motion vector 9, and prediction mode information 26 indicating the prediction mode determined by the prediction mode determining section 12 are respectively encoded in encoding sections 101 to 104, and then multiplexed in the multiplexer 27′ so as to generate the encoder output (i.e., encoded bit sequence) 28′.
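The per-macroblock choice might look like the following sketch. The patent only says the candidate with the smaller error is chosen, so the sum-of-squared-errors measure is our assumption:

```python
import numpy as np

def choose_prediction(input_block, global_block, local_block):
    # Compare the two candidate predictions against the input frame's
    # macroblock and return the mode flag plus the chosen block.
    err_global = int(((input_block.astype(np.int32) - global_block) ** 2).sum())
    err_local = int(((input_block.astype(np.int32) - local_block) ** 2).sum())
    if err_global <= err_local:
        return "global", global_block  # global motion-compensating prediction 7
    return "local", local_block        # local motion-compensating prediction 11
```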
In order to make the reference frames in both the encoder and decoder agree with each other, the quantized index 19 is restored to quantization representative value 21 by inverse quantizer 20, and then inversely converted into difference frame 23 by inverse DCT section 22. The difference frame 23 and the predicted frame 13 are added in adder 24, so that locally decoded frame 25 is obtained. This locally decoded frame 25 is stored in frame memory 2 and is used as a reference frame when the next frame is encoded.
In the decoder (see FIG. 4), the received encoded bit sequence 28′ is separated by demultiplexer 29′ into four encoded components, that is, quantized index 19, prediction mode information 26, motion vector 9, and global motion parameters 5.
