Image analysis – Applications – Motion or velocity measuring
Reexamination Certificate (active)
Patent number: 06483928
Filed: 2000-03-17
Issued: 2002-11-19
Examiner: Zimmerman, Mark (Department: 2671)
Classification: C382S232000
FIELD OF THE INVENTION
The present invention relates to digital video image processing systems and, in particular, to systems for decoding sequences of compressed pictures by means of motion prediction and motion compensation algorithms, and to a method of motion estimation.
BACKGROUND OF THE INVENTION
Spatio-temporal recursive estimation was disclosed for the first time in the article G. de Haan, P. W. A. C. Biezen, H. Huijgen, O. A. Ojo, "True motion estimation with 3-D recursive search block matching", IEEE Trans. Circuits and Systems for Video Technology, Vol. 3, October 1993, pp. 368-379. Instead of carrying out several comparisons (matching errors), the vectors that guarantee the best spatial coherence of the vector field are singled out among the vectors associated with the macroblocks neighboring the current macroblock; such vectors are called "spatial predictors". "Temporal predictors" can also be used, that is, vectors calculated for pairs of preceding pictures, in order to obtain a temporally coherent vector field.
Following this approach, a new algorithm has been realized for H.263 applications, described in European Patent Applications 97402763.3 and 98200461.6.
Basic Principles
In motion estimation algorithms of the block-matching kind, a displacement vector (often improperly called a motion vector) d(b_c, t) is assigned to the center b_c = (x_c, y_c)^tr of a block of pixels B(b_c) of the current picture I(x, t), where tr stands for "transposed". This assignment is carried out if B(b_c) has a good correlation, or matching, with a similar block inside a search area SA(b_c), also centered on b_c but belonging to the preceding picture I(x, t−T), where T is the time interval between the two pictures being coded, which corresponds to an integer multiple n of the period Tq = 40 ms for PAL sequences or 30 ms for NTSC sequences. The center of such a similar block will be shifted with respect to b_c by the motion vector d(b_c, t).
A number of candidate vectors C are tested, measuring an error e(C, b_c, t) that quantifies the similarity between the considered blocks, in order to find d(b_c, t). A diagram illustrating this procedure is depicted in FIG. 1.
Pixels inside the block B(b_c) have the following positions:

$$x_c - X/2 \le x \le x_c + X/2, \qquad y_c - Y/2 \le y \le y_c + Y/2$$

where X is the block length and Y is the block height (both equal to 16 for MPEG-1, MPEG-2 and H.263) and x = (x, y)^tr is its spatial position on the picture.
Candidate vectors are selected within the set of candidates CS(b_c, t) determined by:

$$CS(\mathbf{b}_c,t)=\left\{\mathbf{d}\!\left(\mathbf{b}_c-\begin{pmatrix}X\\Y\end{pmatrix},t\right)+\mathbf{U}_1(\mathbf{b}_c),\;\mathbf{d}\!\left(\mathbf{b}_c-\begin{pmatrix}-X\\Y\end{pmatrix},t\right)+\mathbf{U}_2(\mathbf{b}_c),\;\mathbf{d}\!\left(\mathbf{b}_c-\begin{pmatrix}0\\-2Y\end{pmatrix},t-T\right)\right\}\qquad(1)$$
where the update vectors U_1(b_c) and U_2(b_c) are randomly selected from a set of updates US(b_c), defined by:

$$US(\mathbf{b}_c)=US_i(\mathbf{b}_c)\cup US_f(\mathbf{b}_c)$$
where the integer updates US_i(b_c) are given by:

$$US_i(\mathbf{b}_c)=\left\{\begin{pmatrix}0\\0\end{pmatrix},\begin{pmatrix}0\\1\end{pmatrix},\begin{pmatrix}0\\-1\end{pmatrix},\begin{pmatrix}1\\0\end{pmatrix},\begin{pmatrix}-1\\0\end{pmatrix},\begin{pmatrix}0\\2\end{pmatrix},\begin{pmatrix}0\\-2\end{pmatrix},\begin{pmatrix}2\\0\end{pmatrix},\begin{pmatrix}-2\\0\end{pmatrix},\begin{pmatrix}0\\3\end{pmatrix},\begin{pmatrix}0\\-3\end{pmatrix},\begin{pmatrix}3\\0\end{pmatrix},\begin{pmatrix}-3\\0\end{pmatrix}\right\}\qquad(2)$$
The fractional update vectors US_f(b_c), necessary to achieve ½-pixel accuracy, are given by:

$$US_f(\mathbf{b}_c)=\left\{\begin{pmatrix}0\\1/2\end{pmatrix},\begin{pmatrix}0\\-1/2\end{pmatrix},\begin{pmatrix}1/2\\0\end{pmatrix},\begin{pmatrix}-1/2\\0\end{pmatrix}\right\}\qquad(3)$$
Both U_1(b_c) and U_2(b_c) contain the zero update vector (0, 0)^tr.
From these equations it can be seen that the set of candidates is constituted by temporal and spatial predictors, taken from a 3-D set, and by an update prediction vector. This procedure implicitly makes the vector field spatially and temporally consistent. The updating process consists of adding updates, from time to time, to one of the two spatial predictors. The positions of the spatial and spatio-temporal predictors with respect to the current block are shown in FIG. 2.
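As a minimal illustrative sketch (not taken from the patent), the following Python fragment assembles a candidate set of this kind. The function name candidate_set, the dictionaries spatial_field and temporal_field (keyed by macroblock centers and holding already-estimated vectors), the defaulting of missing neighbors to the zero vector, and the convention that y grows downward in raster order are all assumptions made for the example.

```python
import random

# Integer updates US_i of equation (2) and fractional updates US_f of
# equation (3), both expressed as (x, y) offsets in pixels.
US_I = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0),
        (0, 2), (0, -2), (2, 0), (-2, 0),
        (0, 3), (0, -3), (3, 0), (-3, 0)]
US_F = [(0, 0.5), (0, -0.5), (0.5, 0), (-0.5, 0)]
US = US_I + US_F  # US = US_i U US_f

def candidate_set(bc, spatial_field, temporal_field, X=16, Y=16):
    """Assemble CS(bc, t) of equation (1).

    spatial_field holds the vectors already estimated for the current
    picture, temporal_field the vectors of the preceding picture pair;
    both are dictionaries keyed by macroblock centers (x, y).
    """
    xc, yc = bc
    zero = (0, 0)
    u1, u2 = random.choice(US), random.choice(US)
    s1 = spatial_field.get((xc - X, yc - Y), zero)    # d(bc - (X, Y)^tr, t)
    s2 = spatial_field.get((xc + X, yc - Y), zero)    # d(bc - (-X, Y)^tr, t)
    tp = temporal_field.get((xc, yc + 2 * Y), zero)   # d(bc - (0, -2Y)^tr, t - T)
    return [(s1[0] + u1[0], s1[1] + u1[1]),
            (s2[0] + u2[0], s2[1] + u2[1]),
            tp]
```

In this sketch the two spatial predictors each receive one randomly drawn update, while the temporal predictor is used as is, in accordance with equation (1).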
The displacement vector d(b_c, t) resulting from the block-matching process is the candidate vector C that produces the lowest value of the following error function e(C, b_c, t):

$$\mathbf{d}(\mathbf{b}_c,t)=\left\{\mathbf{C}\in CS\;\middle|\;e(\mathbf{C},\mathbf{b}_c,t)\le e(\mathbf{V},\mathbf{b}_c,t)\;\;\forall\,\mathbf{V}\in CS(\mathbf{b}_c,t)\right\}\qquad(4)$$
The error function is based on the difference between the luminance values of the current block of the current picture I(x, t) and those of the block shifted in the picture I(x − C, t − T), summed over the block B(b_c). A typical choice, which is also adopted here, is the sum of absolute differences (SAD). The error function is then given by:

$$e(\mathbf{C},\mathbf{b}_c,t)=SAD(\mathbf{C},\mathbf{b}_c,t)=\sum_{\mathbf{x}\in B(\mathbf{b}_c)}\left|I(\mathbf{x},t)-I(\mathbf{x}-\mathbf{C},t-T)\right|\qquad(5)$$
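A minimal Python sketch of equations (4) and (5), assuming the pictures are available as 2-D luminance arrays; the names sad and best_candidate, the NumPy representation, the rounding of candidates to integer-pel positions, and the omission of border clipping and ½-pixel interpolation are simplifications made for the example, not part of the disclosed method.

```python
import numpy as np

def sad(cur, prev, bc, c, X=16, Y=16):
    """Equation (5): sum of absolute luminance differences between the block
    B(bc) of the current picture and the block displaced by the candidate c
    in the preceding picture. cur and prev are 2-D arrays indexed as [y, x];
    border clipping and 1/2-pixel interpolation are omitted for brevity."""
    xc, yc = bc
    x0, y0 = xc - X // 2, yc - Y // 2
    cx, cy = int(round(c[0])), int(round(c[1]))
    block_cur = cur[y0:y0 + Y, x0:x0 + X].astype(np.int32)
    block_ref = prev[y0 - cy:y0 - cy + Y, x0 - cx:x0 - cx + X].astype(np.int32)
    return int(np.abs(block_cur - block_ref).sum())

def best_candidate(cur, prev, bc, candidates):
    """Equation (4): keep the candidate with the lowest matching error."""
    return min(candidates, key=lambda c: sad(cur, prev, bc, c))
```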
It would be possible to adapt the distribution of the updates to the calculated errors, as is done in SLIMPEG, such that the lower the errors, the more concentrated the distribution, and the greater the errors, the more dispersed the distribution. Nevertheless, such a distribution strongly depends on the sequences used to calculate it. In H.263, the bit-rate and, above all, the frame-rate are strongly variable (the frame rate is constant in MPEG-2, while in H.263 it is not); it would therefore be very difficult to define appropriate "training" sequences.
Iterative Estimation
In order to further improve the consistency of the vector field, the motion estimation is iterated m times (m = 4 at most) on the same pair of pictures, using the vectors calculated in the preceding iteration as temporal predictors for the current iteration. During the first and third iterations, pictures are scanned in video raster-scan order, that is, from top to bottom and from left to right (as they are usually displayed on a TV screen). During the second and fourth iterations, on the contrary, pictures are scanned in the opposite direction, from bottom to top and from right to left. This is made possible by storing the pictures in an SDRAM (if they were stored in a FIFO, only the raster-scan order would be possible).
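The alternating scan order can be sketched as follows; the generator name macroblock_scan and the per-iteration reversal of both row and column order are illustrative assumptions, not code from the patent.

```python
def macroblock_scan(width, height, iteration, X=16, Y=16):
    """Yield macroblock centers in raster order (top-to-bottom, left-to-right)
    for odd-numbered iterations and in the reversed order (bottom-to-top,
    right-to-left) for even-numbered ones."""
    cols = [x + X // 2 for x in range(0, width - X + 1, X)]
    rows = [y + Y // 2 for y in range(0, height - Y + 1, Y)]
    if iteration % 2 == 0:                 # 2nd and 4th iterations
        cols, rows = cols[::-1], rows[::-1]
    for yc in rows:
        for xc in cols:
            yield (xc, yc)
```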
Candidate vectors are then selected from a new set of candidates CS′(b_c, t), which is defined by:

$$CS'(\mathbf{b}_c,t)=\left\{\mathbf{d}\!\left(\mathbf{b}_c-\begin{pmatrix}X\\(-1)^{i+1}Y\end{pmatrix},t\right)+\mathbf{U}_1(\mathbf{b}_c),\;\mathbf{d}\!\left(\mathbf{b}_c-\begin{pmatrix}-X\\(-1)^{i+1}Y\end{pmatrix},t\right)+\mathbf{U}_2(\mathbf{b}_c),\;\mathbf{d}\!\left(\mathbf{b}_c-\begin{pmatrix}0\\(-1)^{i}\,2Y\end{pmatrix},t-T/i\right)\right\}$$
where i is the number of the current iteration and d(b_c, t − T/i) is the vector obtained in the preceding iteration (i − 1) or, for the first iteration on a new pair of pictures, in the last iteration on the preceding pair. The computing complexity of this estimator is practically constant when the frame rate changes, since the number of iterations for each pair of pictures varies according to the length of the time interval between the pictures:

Time interval T = nTq    Skipped images    Iterations
n = 1                    0                 0
n = 2                    1                 2
n = 3                    2                 3
n = 4                    3                 4

The case n ≥ 5 is not considered, because the bit-rate controller is assumed to be unlikely to skip 4 consecutive pictures (although it could happen).
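For completeness, the per-iteration candidate set CS′ can be sketched along the same lines as the earlier candidate_set example; the function name, the dictionary-based vector fields and the updates parameter (the set US = US_i ∪ US_f passed in explicitly) are again assumptions made for the illustration.

```python
import random

def candidate_set_iter(bc, spatial_field, prev_iter_field, i, updates, X=16, Y=16):
    """Assemble CS'(bc, t) for iteration i: the vertical offset of the two
    spatial predictors follows the scan direction through the factor
    (-1)^(i+1), and the temporal predictor is read from the vector field
    of the preceding iteration."""
    xc, yc = bc
    zero = (0, 0)
    sign = (-1) ** (i + 1)
    u1, u2 = random.choice(updates), random.choice(updates)
    s1 = spatial_field.get((xc - X, yc - sign * Y), zero)           # d(bc - (X, (-1)^(i+1) Y)^tr, t)
    s2 = spatial_field.get((xc + X, yc - sign * Y), zero)           # d(bc - (-X, (-1)^(i+1) Y)^tr, t)
    tp = prev_iter_field.get((xc, yc - ((-1) ** i) * 2 * Y), zero)  # d(bc - (0, (-1)^i 2Y)^tr, t - T/i)
    return [(s1[0] + u1[0], s1[1] + u1[1]),
            (s2[0] + u2[0], s2[1] + u2[1]),
            tp]
```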
Macroblocks Undersampling
The computing complexity of the motion estimation can be decreased by halving the number of macroblocks on which vectors are calculated, with the technique described in the article G. de Haan, P. W. A. C. Biezen, "Sub-pixel motion estimation with 3-D recursive block-matching", Signal Processing: Image Communication 6 (1995), pp. 485-498. The grid of undersampled macroblocks typically has a quincunx pattern (as depicted in FIG. 3).
If the vector d_m = d(b_c, t) is missing, it can be calculated from the adjacent vectors d_a as:
$$\mathbf{d}_m=\operatorname{median}(\mathbf{d}_l,\mathbf{d}_r,\mathbf{d}_{av})\qquad(6)$$
where

$$\mathbf{d}_l=\mathbf{d}_a\!\left(\mathbf{b}_c-\begin{pmatrix}X\\0\end{pmatrix},t\right),\qquad \mathbf{d}_r=\mathbf{d}_a\!\left(\mathbf{b}_c+\begin{pmatrix}X\\0\end{pmatrix},t\right),\qquad \mathbf{d}_{av}=\tfrac{1}{2}\,(\mathbf{d}_t+\mathbf{d}_b)$$

and

$$\mathbf{d}_t=\mathbf{d}_a\!\left(\mathbf{b}_c-\begin{pmatrix}0\\Y\end{pmatrix},t\right),\qquad \mathbf{d}_b=\mathbf{d}_a\!\left(\mathbf{b}_c+\begin{pmatrix}0\\Y\end{pmatrix},t\right)$$
The median interpolating filter acts separately on both components of the vectors.
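A minimal sketch of this interpolation of a missing vector on the quincunx grid, assuming the available vectors are kept in a dictionary keyed by macroblock centers and that missing neighbors default to the zero vector; the name interpolate_missing and these storage choices are illustrative only.

```python
def interpolate_missing(bc, field, X=16, Y=16):
    """Equation (6): rebuild the vector of a skipped macroblock as the
    component-wise median of the left neighbor, the right neighbor and the
    average of the top and bottom neighbors."""
    xc, yc = bc
    zero = (0.0, 0.0)
    d_l = field.get((xc - X, yc), zero)
    d_r = field.get((xc + X, yc), zero)
    d_t = field.get((xc, yc - Y), zero)
    d_b = field.get((xc, yc + Y), zero)
    d_av = ((d_t[0] + d_b[0]) / 2.0, (d_t[1] + d_b[1]) / 2.0)

    def med3(a, b, c):
        return sorted((a, b, c))[1]      # median of three scalars

    return (med3(d_l[0], d_r[0], d_av[0]), med3(d_l[1], d_r[1], d_av[1]))
```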
Inventors: Bagni Daniele, Giovenzana Davide, Pezzoni Luca
Assignee: STMicroelectronics S.r.l.
Attorneys: Allen Dyer Doppelt Milbrath & Gilchrist, P.A.; Jorgenson Lisa K.
Examiners: Cao Huedung X.; Zimmerman Mark