Image analysis – Image compression or coding – Interframe coding
Reexamination Certificate
2001-06-08
2003-06-24
Wu, Jingge (Department: 2623)
C382S233000, C382S238000, C382S251000
active
06584227
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an image sequence coding and decoding method which performs interframe prediction using quantized values for chrominance or luminance intensity.
2. Description of Related Art
In high efficiency coding of image sequences, interframe prediction (motion compensation), which exploits the similarity of adjacent frames over time, is known to be a highly effective technique for data compression. Today's most frequently used motion compensation method is block matching with half pixel accuracy, which is used in the international standards H.263, MPEG1, and MPEG2. In this method, the image to be coded is segmented into blocks, and the horizontal and vertical components of the motion vectors of these blocks are estimated as integral multiples of half the distance between adjacent pixels. This process is described using the following equation:
[Equation 1]
$$P(x, y) = R(x + u_i,\ y + v_i), \quad (x, y) \in B_i,\ 0 \le i < N \tag{1}$$
where P(x, y) and R(x, y) denote the sample values (luminance or chrominance intensity) of pixels located at coordinates (x, y) in the predicted image P of the current frame and the reference image (decoded image of a frame which has been encoded before the current frame) R, respectively. x and y are integers, and it is assumed that all the pixels are located at points where the coordinate values are integers. Additionally it is assumed that the sample values of the pixels are quantized to non-negative integers. N, Bi, and (ui, vi) denote the number of blocks in the image, the set of pixels included in the i-th block of the image, and the motion vectors of the i-th block, respectively.
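As an illustration of Equation 1, the following minimal Python sketch synthesizes a predicted image from a reference image for integer-valued motion vectors only. The function name `predict_block_matching`, the list-of-rows image layout (`ref[y][x]`), and the clamping of coordinates at the image border are assumptions made for this example; the text above does not specify how references outside the image are handled.

```python
def predict_block_matching(ref, blocks, motion_vectors):
    """Build the predicted image P from the reference image R (Equation 1).

    ref            -- reference image as a list of rows, ref[y][x]
    blocks         -- list of blocks B_i, each a list of (x, y) pixel coordinates
    motion_vectors -- list of integer motion vectors (u_i, v_i), one per block
    """
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for block, (u, v) in zip(blocks, motion_vectors):
        for x, y in block:
            # Assumption: coordinates falling outside the image are clamped
            # to the nearest border pixel.
            src_x = min(max(x + u, 0), width - 1)
            src_y = min(max(y + v, 0), height - 1)
            pred[y][x] = ref[src_y][src_x]  # P(x, y) = R(x + u_i, y + v_i)
    return pred


# Hypothetical 4x4 reference image whose sample value encodes its position.
ref = [[10 * y + x for x in range(4)] for y in range(4)]
left = [(x, y) for y in range(4) for x in range(2)]
right = [(x, y) for y in range(4) for x in range(2, 4)]
pred = predict_block_matching(ref, [left, right], [(1, 0), (0, 1)])
```

Here the left block is predicted from the reference shifted by one pixel horizontally, and the right block from the reference shifted by one pixel vertically.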
When the values of ui and vi are not integers, it is necessary to find the intensity value at points in the reference image where no pixel actually exists. Currently, bilinear interpolation using the adjacent four pixels is the most frequently used method for this process. This interpolation method is described using the following equation:
[Equation 2]
$$R\left(x + \frac{p}{d},\ y + \frac{q}{d}\right) = \bigl((d - q)\bigl((d - p)R(x, y) + pR(x + 1, y)\bigr) + q\bigl((d - p)R(x, y + 1) + pR(x + 1, y + 1)\bigr)\bigr) \mathbin{//} d^2 \tag{2}$$
where d is a positive integer, and p and q are smaller than d but not smaller than 0. “//” denotes integer division which rounds the result of normal division (division using real numbers) to the nearest integer.
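The interpolation of Equation 2 can be sketched in Python as follows. Note that the "//" of Equation 2 (round to nearest) is not the same as Python's `//` operator (floor division), so a helper is used. The names `round_div` and `interpolate` are hypothetical, and two details are assumptions not stated in the passage above: results exactly halfway between two integers are rounded up, and neighbour indices are clamped at the image border.

```python
def round_div(numerator, denominator):
    """The '//' of Equation 2 for non-negative operands: divide, then round
    to the nearest integer. Assumption: exact halves round up."""
    return (2 * numerator + denominator) // (2 * denominator)


def interpolate(ref, x, y, p, q, d):
    """Return R(x + p/d, y + q/d) by bilinear interpolation (Equation 2).

    ref is a list of rows (ref[y][x]) of non-negative integer samples,
    and 0 <= p < d, 0 <= q < d.
    """
    height, width = len(ref), len(ref[0])
    # Assumption: clamp the right and lower neighbours at the image border.
    x1 = min(x + 1, width - 1)
    y1 = min(y + 1, height - 1)
    top = (d - p) * ref[y][x] + p * ref[y][x1]
    bottom = (d - p) * ref[y1][x] + p * ref[y1][x1]
    return round_div((d - q) * top + q * bottom, d * d)


# Half pixel accuracy corresponds to d = 2.
ref = [[10, 20], [30, 41]]
full_pel = interpolate(ref, 0, 0, 0, 0, 2)    # sample at (0, 0)
half_h = interpolate(ref, 0, 0, 1, 0, 2)      # sample at (0.5, 0)
half_both = interpolate(ref, 0, 0, 1, 1, 2)   # sample at (0.5, 0.5)
```

With d = 2, the sample at (0.5, 0.5) is the rounded average of the four surrounding pixels, which is the half pixel case used by the standards named above.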
An example of the structure of an H.263 video encoder is shown in FIG. 1. As the coding algorithm, H.263 adopts a hybrid coding method (adaptive interframe/intraframe coding method) which is a combination of block matching and DCT (discrete cosine transform). A subtractor 102 calculates the difference between the input image (current frame base image) 101 and the output image 113 (described later) of the interframe/intraframe coding selector 119, and then outputs an error image 103. This error image is converted into DCT coefficients in a DCT converter 104 and then quantized in a quantizer 105 to form quantized DCT coefficients 106. These quantized DCT coefficients are transmitted through the communication channel and, at the same time, are used to synthesize the interframe predicted image in the encoder. The procedure for synthesizing the predicted image is explained next. The above-mentioned quantized DCT coefficients 106 form the reconstructed error image 110 (the same as the reconstructed error image on the receiver side) after passing through a dequantizer 108 and an inverse DCT converter 109. This reconstructed error image and the output image 113 of the interframe/intraframe coding selector 119 are added at the adder 111, and the decoded image 112 of the current frame (the same image as the decoded image of the current frame reconstructed on the receiver side) is obtained. This image is stored in a frame memory 114 and delayed for a time equal to the frame interval. Accordingly, at the current point, the frame memory 114 outputs the decoded image 115 of the previous frame. This decoded image of the previous frame and the original image 101 of the current frame are input to the block matching section 116, and block matching is performed between these images. In the block matching process, the original image of the current frame is segmented into multiple blocks, and the predicted image 117 of the current frame is synthesized by extracting the sections most resembling these blocks from the decoded image of the previous frame. In this process, it is necessary to estimate the motion between the previous frame and the current frame for each block. The motion vector for each block estimated in the motion estimation process is transmitted to the receiver side as motion vector data 120. On the receiver side, the same prediction image as on the transmitter side is synthesized using the motion vector information and the decoded image of the previous frame. The prediction image 117 is input along with a “0” signal 118 to the interframe/intraframe coding selector 119. This switch 119 selects interframe coding or intraframe coding by selecting either of these inputs. Interframe coding is performed when the prediction image 117 is selected (this case is shown in FIG. 2). On the other hand, when the “0” signal is selected, intraframe coding is performed, since the input image itself is converted to DCT coefficients and output to the communication channel. In order for the receiver side to correctly reconstruct the coded image, the receiver must be informed whether intraframe coding or interframe coding was performed on the transmitter side. Consequently, an identifier flag 121 is output to the communication circuit. Finally, an H.263 coded bit stream 123 is acquired by multiplexing the quantized DCT coefficients, the motion vectors, and the interframe/intraframe identifier flag information in a multiplexer 122.
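The coding loop described above can be sketched roughly in Python as shown below. This is a simplified illustration, not the H.263 algorithm: the DCT and inverse DCT stages (104, 109) are omitted, so the error image is quantized directly in the pixel domain with a uniform step size. The function name `encode_frame` and the parameters `use_inter` and `step` are hypothetical. The point of the sketch is that the encoder reconstructs the same decoded image the receiver will obtain, and it is this decoded image, not the original, that would be stored in the frame memory.

```python
def encode_frame(current, prediction, use_inter, step):
    """One pass of the hybrid coding loop, with the DCT omitted (assumption).

    current    -- original image of the current frame (list of rows)
    prediction -- prediction image produced by block matching
    use_inter  -- True selects the prediction image, False the "0" signal
    step       -- uniform quantizer step size (stand-in for quantizer 105)
    Returns (levels, decoded): the quantized error image to transmit and the
    decoded image that would be stored in the frame memory.
    """
    levels, decoded = [], []
    for cur_row, pred_row in zip(current, prediction):
        level_row, decoded_row = [], []
        for cur, pred in zip(cur_row, pred_row):
            base = pred if use_inter else 0      # coding selector 119
            error = cur - base                   # subtractor 102
            level = round(error / step)          # quantizer 105 (no DCT here)
            reconstructed_error = level * step   # dequantizer 108
            level_row.append(level)
            decoded_row.append(base + reconstructed_error)  # adder 111
        levels.append(level_row)
        decoded.append(decoded_row)
    return levels, decoded


current = [[101, 104], [91, 59]]
prediction = [[98, 100], [90, 70]]
inter_levels, inter_decoded = encode_frame(current, prediction, True, 4)
intra_levels, intra_decoded = encode_frame(current, prediction, False, 4)
```

In this toy example the interframe levels are much smaller in magnitude than the intraframe levels, which is the reason prediction reduces the amount of data to be coded.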
The structure of a decoder 200 for receiving the coded bit stream output from the encoder of FIG. 1 is shown in FIG. 2. The received H.263 coded bit stream 217 is demultiplexed into quantized DCT coefficients 201, motion vector data 202, and an interframe/intraframe identifier flag 203 in the demultiplexer 216. The quantized DCT coefficients 201 become a decoded error image 206 after being processed by an inverse quantizer 204 and an inverse DCT converter 205. This decoded error image is added to the output image 215 of the interframe/intraframe coding selector 214 in an adder 207, and the sum of these images is output as the decoded image 208. The output of the interframe/intraframe coding selector is switched according to the interframe/intraframe identifier flag 203. A prediction image 212 utilized when performing interframe coding is synthesized in the prediction image synthesizer 211. In this synthesizer, the position of the blocks in the decoded image 210 of the previous frame stored in frame memory 209 is shifted according to the motion vector data 202. On the other hand, for intraframe coding, the interframe/intraframe coding selector outputs the “0” signal 213 as is.
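The decoder-side reconstruction can be sketched in the same simplified way. As an assumption for brevity, the inverse DCT converter 205 is omitted and the received values are treated as pixel-domain quantization levels with a uniform step size; `decode_frame`, `is_inter`, and `step` are hypothetical names. The sketch shows only how the identifier flag switches between adding the prediction image and adding the “0” signal.

```python
def decode_frame(levels, prediction, is_inter, step):
    """Reconstruct a decoded image from received quantization levels.

    levels     -- received quantized error image (list of rows)
    prediction -- prediction image from the prediction image synthesizer
    is_inter   -- interframe/intraframe identifier flag 203
    step       -- uniform quantizer step size (stand-in, no inverse DCT)
    """
    decoded = []
    for level_row, pred_row in zip(levels, prediction):
        decoded_row = []
        for level, pred in zip(level_row, pred_row):
            base = pred if is_inter else 0        # coding selector 214
            decoded_error = level * step          # inverse quantizer 204
            decoded_row.append(base + decoded_error)  # adder 207
        decoded.append(decoded_row)
    return decoded


prediction = [[98, 100], [90, 70]]
inter_image = decode_frame([[1, 1], [0, -3]], prediction, True, 4)
intra_image = decode_frame([[25, 26], [23, 15]], prediction, False, 4)
```

In a full decoder the returned image would also be written to the frame memory so that it can serve as the reference for the next frame.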
SUMMARY OF THE INVENTION
The image encoded by H.263 comprises a luminance plane (Y plane) containing luminance information and two chrominance planes (U plane and V plane) containing chrominance information. Characteristically, when the image has 2m pixels in the horizontal direction and 2n pixels in the vertical direction (m and n are positive integers), the Y plane has 2m pixels horizontally and 2n pixels vertically, while the U and V planes each have m pixels horizontally and n pixels vertically. The chrominance planes have a lower resolution because the human visual system is comparatively insensitive to spatial variations in chrominance. With such an image as input, H.263 performs coding and decoding in block units referred to as macroblocks. The structure of a macroblo