Reexamination Certificate
2001-06-08
2003-08-12
Wu, Jingge (Department: 2623)
Image analysis
Image compression or coding
Quantization
C382S236000, C375S240160
active
06606419
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an image sequence coding and decoding method which performs interframe prediction using quantized values for chrominance or luminance intensity.
2. Description of Related Art
In high efficiency coding of image sequences, interframe prediction (motion compensation), which exploits the similarity of temporally adjacent frames, is known to be a highly effective technique for data compression. Today's most frequently used motion compensation method is block matching with half pixel accuracy, which is used in the international standards H.263, MPEG1, and MPEG2. In this method, the image to be coded is segmented into blocks, and the horizontal and vertical components of the motion vectors of these blocks are estimated as integral multiples of half the distance between adjacent pixels. This process is described using the following equation:
[Equation 1]
$$P(x, y) = R(x + u_i,\; y + v_i), \quad (x, y) \in B_i,\; 0 \le i < N \qquad (1)$$
where P(x, y) and R(x, y) denote the sample values (luminance or chrominance intensity) of the pixels located at coordinates (x, y) in the predicted image P of the current frame and in the reference image R (the decoded image of a frame which has been encoded before the current frame), respectively. x and y are integers, and it is assumed that all the pixels are located at points where the coordinate values are integers. Additionally, it is assumed that the sample values of the pixels are quantized to non-negative integers. N, B_i, and (u_i, v_i) denote the number of blocks in the image, the set of pixels included in the i-th block of the image, and the motion vector of the i-th block, respectively.
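As a rough illustration of Equation 1 restricted to integer motion vectors, the following Python sketch builds the predicted image by copying, for each block, the reference pixels displaced by that block's motion vector. The function name `predict_integer_mc`, the representation of a block as a list of (x, y) coordinates, and the clamping of out-of-range coordinates to the image border are assumptions made for this example; the text above does not specify how borders are handled.

```python
def predict_integer_mc(ref, blocks, vectors):
    """Synthesize P(x, y) = R(x + u_i, y + v_i) for every (x, y) in B_i.

    ref     -- reference image R as a list of rows, indexed ref[y][x]
    blocks  -- list of blocks B_i, each a list of (x, y) pixel coordinates
    vectors -- list of integer motion vectors (u_i, v_i), one per block
    """
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for block, (u, v) in zip(blocks, vectors):
        for x, y in block:
            # Assumption: displaced coordinates are clamped to the image border.
            sx = min(max(x + u, 0), width - 1)
            sy = min(max(y + v, 0), height - 1)
            pred[y][x] = ref[sy][sx]
    return pred


# Hypothetical 4x4 reference image split into two blocks of two columns each.
ref = [[10, 20, 30, 40],
       [11, 21, 31, 41],
       [12, 22, 32, 42],
       [13, 23, 33, 43]]
left = [(x, y) for y in range(4) for x in range(2)]
right = [(x, y) for y in range(4) for x in range(2, 4)]
pred = predict_integer_mc(ref, [left, right], [(1, 0), (0, 1)])
```

Here the left block is predicted from pixels one column to the right, and the right block from pixels one row below, with the bottom row repeated because of the assumed clamping.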
When the values of u_i and v_i are not integers, it is necessary to find the intensity value at points where no pixels actually exist in the reference image. Currently, bilinear interpolation using the adjacent four pixels is the most frequently used method for this process. This interpolation method is described using the following equation:
[Equation 2]
$$R\left(x + \frac{p}{d},\; y + \frac{q}{d}\right) = \bigl((d - q)\bigl((d - p)R(x, y) + pR(x + 1, y)\bigr) + q\bigl((d - p)R(x, y + 1) + pR(x + 1, y + 1)\bigr)\bigr) \mathbin{//} d^2 \qquad (2)$$
where d is a positive integer, and p and q are smaller than d but not smaller than 0. “//” denotes integer division which rounds the result of normal division (division using real numbers) to the nearest integer.
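The interpolation of Equation 2 can be sketched in Python as follows. The names `round_div` and `interpolate` are hypothetical, and the sketch assumes that p and q are integers and that an exact half is rounded up; the text above only states that "//" rounds to the nearest integer and does not say how ties are resolved.

```python
def round_div(a, b):
    """Divide a by b and round to the nearest integer.

    Assumption: a >= 0, b > 0, and exact halves are rounded up.
    """
    return (2 * a + b) // (2 * b)


def interpolate(ref, x, y, p, q, d):
    """Bilinear estimate of R(x + p/d, y + q/d), with 0 <= p, q < d.

    ref is indexed ref[y][x]; the caller must ensure that x + 1 and
    y + 1 are valid indices.
    """
    top = (d - p) * ref[y][x] + p * ref[y][x + 1]
    bottom = (d - p) * ref[y + 1][x] + p * ref[y + 1][x + 1]
    return round_div((d - q) * top + q * bottom, d * d)


# Half pixel accuracy corresponds to d = 2.
ref = [[1, 2],
       [3, 4]]
on_pixel = interpolate(ref, 0, 0, 0, 0, 2)   # exactly on pixel (0, 0)
half_x = interpolate(ref, 0, 0, 1, 0, 2)     # midway between (0, 0) and (1, 0)
center = interpolate(ref, 0, 0, 1, 1, 2)     # center of the four pixels
```

With these sample values, `half_x` is the exact value 1.5 rounded to 2 and `center` is the exact value 2.5 rounded to 3, both under the assumed tie-breaking rule.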
An example of the structure of an H.263 video encoder is shown in FIG. 1. As the coding algorithm, H.263 adopts a hybrid coding method (adaptive interframe/intraframe coding method) which is a combination of block matching and DCT (discrete cosine transform). A subtractor 102 calculates the difference between the input image (current frame base image) 101 and the output image 113 (described later) of the interframe/intraframe coding selector 119, and then outputs an error image 103. This error image is converted into DCT coefficients in a DCT converter 104 and then quantized in a quantizer 105 to form quantized DCT coefficients 106. These quantized DCT coefficients are transmitted through the communication channel and, at the same time, used to synthesize the interframe predicted image in the encoder. The procedure for synthesizing the predicted image is explained next. The above-mentioned quantized DCT coefficients 106 form the reconstructed error image 110 (the same as the reconstructed error image on the receiver side) after passing through a dequantizer 108 and an inverse DCT converter 109. This reconstructed error image and the output image 113 of the interframe/intraframe coding selector 119 are added at the adder 111, and the decoded image 112 of the current frame (the same image as the decoded image of the current frame reconstructed on the receiver side) is obtained. This image is stored in a frame memory 114 and delayed for a time equal to the frame interval. Accordingly, at the current point, the frame memory 114 outputs the decoded image 115 of the previous frame. This decoded image of the previous frame and the original image 101 of the current frame are input to the block matching section 116, and block matching is performed between these images. In the block matching process, the original image of the current frame is segmented into multiple blocks, and the predicted image 117 of the current frame is synthesized by extracting the sections most resembling these blocks from the decoded image of the previous frame. In this process, it is necessary to estimate the motion between the previous frame and the current frame for each block. The motion vector for each block estimated in the motion estimation process is transmitted to the receiver side as motion vector data 120. On the receiver side, the same prediction image as on the transmitter side is synthesized using the motion vector information and the decoded image of the previous frame. The prediction image 117 is input along with a “0” signal 118 to the interframe/intraframe coding selector 119. This switch 119 selects interframe coding or intraframe coding by selecting either of these inputs. Interframe coding is performed when the prediction image 117 is selected (this case is shown in FIG. 2). On the other hand, when the “0” signal is selected, intraframe coding is performed, since the input image itself is converted to DCT coefficients and output to the communication channel. In order for the receiver side to correctly reconstruct the coded image, the receiver must be informed whether intraframe coding or interframe coding was performed on the transmitter side. Consequently, an identifier flag 121 is output to the communication circuit. Finally, an H.263 coded bitstream 123 is acquired by multiplexing the quantized DCT coefficients, the motion vectors, and the interframe/intraframe identifier flag information in a multiplexer 122.
The structure of a decoder 200 for receiving the coded bit stream output from the encoder of FIG. 1 is shown in FIG. 2. The H.263 coded bit stream 217 that is received is demultiplexed into quantized DCT coefficients 201, motion vector data 202, and an interframe/intraframe identifier flag 203 in the demultiplexer 216. The quantized DCT coefficients 201 become a decoded error image 206 after being processed by an inverse quantizer 204 and an inverse DCT converter 205. This decoded error image is added to the output image 215 of the interframe/intraframe coding selector 214 in an adder 207, and the sum of these images is output as the decoded image 208. The output of the interframe/intraframe coding selector is switched according to the interframe/intraframe identifier flag 203. A prediction image 212 utilized when performing interframe encoding is synthesized in the prediction image synthesizer 211. In this synthesizer, the position of the blocks in the decoded image 210 of the previous frame stored in frame memory 209 is shifted according to the motion vector data 202. On the other hand, for intraframe coding, the interframe/intraframe coding selector outputs the “0” signal 213 as is.
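The reason the encoder predicts from its own decoded image rather than from the original can be seen in a heavily simplified Python sketch of the loop formed by the encoder of FIG. 1 and the decoder of FIG. 2. The sketch makes several assumptions that are not in the text: frames are one-dimensional lists of samples, the DCT and motion estimation are omitted (every motion vector is taken as zero), the quantizer is a uniform one with the hypothetical step size `STEP`, and the first frame is intraframe coded while all later frames are interframe coded.

```python
STEP = 4  # hypothetical quantizer step size


def quantize(err):
    # round() with a single argument returns an int in Python 3.
    return [round(e / STEP) for e in err]


def dequantize(levels):
    return [level * STEP for level in levels]


def reconstruct(pred, levels):
    # Adder: prediction plus reconstructed error image.
    return [p + e for p, e in zip(pred, dequantize(levels))]


def encode(frames):
    """Return the coded stream and the encoder's local decoded images."""
    stream, decoded, memory = [], [], None
    for frame in frames:
        is_inter = memory is not None                     # identifier flag
        pred = memory if is_inter else [0] * len(frame)   # "0" signal for intra
        levels = quantize([f - p for f, p in zip(frame, pred)])
        memory = reconstruct(pred, levels)                # stored in frame memory
        stream.append((is_inter, levels))
        decoded.append(memory)
    return stream, decoded


def decode(stream):
    """Rebuild the decoded images from the flag and quantized levels alone."""
    decoded, memory = [], None
    for is_inter, levels in stream:
        pred = memory if is_inter else [0] * len(levels)
        memory = reconstruct(pred, levels)
        decoded.append(memory)
    return decoded


frames = [[10, 20, 30], [12, 19, 33], [15, 18, 37]]
stream, encoder_side = encode(frames)
decoder_side = decode(stream)
```

Because both sides form the prediction from the same decoded image of the previous frame, `decoder_side` matches `encoder_side` exactly, and the quantization error does not accumulate from frame to frame in this simplified setting.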
SUMMARY OF THE INVENTION
The image encoded by H.263 consists of a luminance plane (Y plane) containing luminance information and two chrominance planes (U plane and V plane) containing chrominance information. Characteristically, when the image has 2m pixels in the horizontal direction and 2n pixels in the vertical direction (m and n are positive integers), the Y plane has 2m pixels horizontally and 2n pixels vertically, while the U and V planes have m pixels horizontally and n pixels vertically. The chrominance planes have this lower resolution because the human visual system is comparatively insensitive to spatial variations in chrominance. With such an image as input, H.263 performs coding and decoding in block units referred to as macroblocks. The struct
Antonelli Terry Stout & Kraus LLP
Hitachi, Ltd.