Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal
Reexamination Certificate
1999-01-27
2002-11-19
Le, Vu (Department: 2613)
C375S240090, C382S243000
active
06483874
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention provides efficient motion estimation for an arbitrarily-shaped object for use in an object-based digital video coding system.
Object manipulation is one of the desirable features for multimedia applications. This functionality is available in the developing digital video compression standards, such as H.263+ and MPEG-4. For H.263+, refer to ITU-T Study Group 16, Contribution 999, Draft Text of Recommendation H.263 Version 2 (“H.263+”) for Decision, Sep. 1997, incorporated herein by reference. For MPEG-4, refer to ISO/IEC 14496-2 Committee Draft (MPEG-4), “Information Technology—Coding of audio-visual objects: visual,” JTC1/SC29/WG11 N2202 March 1998, incorporated herein by reference.
MPEG-4 uses a shape coding tool to process an arbitrarily shaped object known as a Video Object Plane (VOP). With shape coding, shape information, referred to as alpha planes, is obtained. Binary alpha planes are encoded by modified Content-based Arithmetic Encoding (CAE), while grey-scale alpha planes are encoded by a motion compensated Discrete Cosine Transform (DCT), similar to texture coding. An alpha plane is bounded by a rectangle that includes the shape of the VOP (Intelligent VOP formation). The bounding rectangle of the VOP is extended on the right-bottom side to multiples of 16×16 blocks, and the extended alpha samples are set to zero. The extended alpha plane is partitioned into blocks of 16×16 samples (e.g., alpha blocks) and the encoding/decoding process is performed on each alpha block.
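The extension and partitioning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alpha plane is taken to be a list of rows of 0/1 samples already cropped to the bounding rectangle, and the function names `pad_alpha_plane` and `alpha_blocks` are hypothetical, not taken from MPEG-4 or from this patent.

```python
def pad_alpha_plane(alpha, block=16):
    """Extend the plane on the right and bottom to multiples of `block`.

    The added samples are set to zero, as described in the text.
    """
    height = len(alpha)
    width = len(alpha[0]) if height else 0
    padded_h = -(-height // block) * block  # round up to a multiple of block
    padded_w = -(-width // block) * block
    padded = [list(row) + [0] * (padded_w - width) for row in alpha]
    padded += [[0] * padded_w for _ in range(padded_h - height)]
    return padded


def alpha_blocks(padded, block=16):
    """Return (top, left, samples) for each block x block alpha block."""
    blocks = []
    for top in range(0, len(padded), block):
        for left in range(0, len(padded[0]), block):
            samples = [row[left:left + block] for row in padded[top:top + block]]
            blocks.append((top, left, samples))
    return blocks
```

For a bounding rectangle of 20 rows by 18 columns, the padded plane would be 32×32 and would split into four alpha blocks.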
Moreover, compression of digital video objects is important in view of the bandwidth-limited channels over which such data is communicated. In particular, motion compensation is the most popular tool to reduce the temporal redundancy in video compression.
Motion estimation and motion compensation (ME/MC) generally involve matching a block of a current video frame (e.g., a current block) with a block in a search area of a reference frame (e.g., a predicted block or reference block). For predictive (P) coded images, the reference block is in a previous frame. For bi-directionally predicted (B) coded images, predicted blocks in previous and subsequent frames may be used. The displacement of the predicted block relative to the current block is the motion vector (MV), which has horizontal (x) and vertical (y) components. Positive values of the MV components indicate that the predicted block is to the right of, and below, the current block.
A motion compensated difference block is formed by subtracting the pixel values of the predicted block from those of the current block point by point. Texture coding is then performed on the difference block. The coded MV and the coded texture information of the difference block are transmitted to the decoder. The decoder can then reconstruct an approximated current block by adding the quantized difference block to the predicted block according to the MV.
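As a rough illustration of the difference-block step, the Python sketch below forms the residual and adds it back to the predicted block. It assumes frames are lists of rows of integer pixel values, and it follows the sign convention given above (positive MV components place the predicted block to the right of and below the current block). Texture coding and quantization are left out, so the reconstruction here is exact rather than approximate. All function names are hypothetical.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def predicted_block(reference, top, left, size, mv):
    """Fetch the predicted block displaced by mv = (mv_x, mv_y)."""
    mv_x, mv_y = mv
    return get_block(reference, top + mv_y, left + mv_x, size)


def difference_block(current_block, predicted):
    """Subtract the predicted block from the current block point by point."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current_block, predicted)]


def reconstruct_block(predicted, residual):
    """Decoder side: add the (here unquantized) residual to the prediction."""
    return [[p + d for p, d in zip(pred_row, res_row)]
            for pred_row, res_row in zip(predicted, residual)]
```

In a real codec the residual would pass through a transform and quantizer before transmission, which is why the decoder recovers only an approximation of the current block.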
Efficiency of motion compensation depends greatly on the quality of its encoding counterpart, motion prediction. Exhaustive search motion estimation is the most reliable method to predict the motion vector. However, this method suffers from very high computational complexity.
Many sub-optimum solutions have been proposed to alleviate the complexity of motion estimation. Most of them sacrifice the search quality to reduce the number of searches.
Full search motion estimation performs a search (also called block matching) for the block inside the search area in the reference picture that best describes the current block in the current picture. The displacement between the best-matched block and the current block, indicated by a motion vector, is later used in the motion compensation process to recover the current block. In other words, a block, B(z,t), at spatial position z and time t will be replaced by another block, B′(z′,t′), at position z′ in the reference picture at time t′, and with a time difference, ι (=t−t′). The motion vector MV(z,t) in this case is the displacement between z′ and z. Hence,
$$B(z,t) = B(z',t') = B(z - MV(z,t),\ t - \iota);$$
and
$$MV(z,t) = \min\bigl(D(B(z,t),\ B(z - MV(z,t),\ t - \iota))\bigr), \quad \forall z' \in \text{search area around } z.$$
Moreover, D(B(z,t),B(z′,t′)) is the prediction error, where “D” is a “delta”. The error can be first order, e.g., an absolute difference, second order, e.g., a square difference, or any higher order. However, the complexity of the calculations increases with higher orders of the prediction error.
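A minimal full search sketch in Python may make the block matching procedure concrete. It uses a first order prediction error (sum of absolute differences) and tries every candidate within ±n pixels. Several details are assumptions for illustration: frames are lists of rows of integers, candidates that fall outside the reference picture are skipped, ties keep the first candidate found, and the returned vector is the candidate position minus the current position. The names `sad` and `full_search` are hypothetical.

```python
def sad(block_a, block_b):
    """First order prediction error: sum of absolute differences."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def full_search(current, reference, top, left, m, n):
    """Match the m x m block at (top, left) against all offsets within +/- n.

    Returns ((mv_x, mv_y), error) for the best-matching candidate.
    """
    cur = [row[left:left + m] for row in current[top:top + m]]
    height, width = len(reference), len(reference[0])
    best_mv, best_err = None, None
    for dy in range(-n, n + 1):
        for dx in range(-n, n + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + m > height or x + m > width:
                continue  # candidate block lies partly outside the picture
            cand = [row[x:x + m] for row in reference[y:y + m]]
            err = sad(cur, cand)
            if best_err is None or err < best_err:
                best_mv, best_err = (dx, dy), err
    return best_mv, best_err
```

Swapping `abs(a - b)` for `(a - b) ** 2` would give a second order error at a higher cost per pixel, in line with the remark above about higher orders.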
Motion estimation is a computationally intensive process. The contribution from all pixels in the block has to be considered in the prediction error calculation. Furthermore, all possible blocks in the search area must be matched in order to obtain a reliable motion vector. In general, a total of $(2n+1)^2 m^2$ comparisons is involved in a motion estimation of an m×m block with a search area of ±n pixels. For example, 278,784 pixel comparisons or 1,089 block searches are required for m,n=16.
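The figures quoted for m,n=16 can be checked with a short calculation. The helper below simply evaluates the count of candidate positions and multiplies it by the number of pixels per block; the function name is hypothetical.

```python
def full_search_cost(m, n):
    """Return (block_searches, pixel_comparisons) for an m x m block, +/- n search."""
    block_searches = (2 * n + 1) ** 2         # candidate positions in the search area
    pixel_comparisons = block_searches * m * m  # every pixel of every candidate
    return block_searches, pixel_comparisons


print(full_search_cost(16, 16))  # (1089, 278784)
```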
Moreover, motion estimation for an arbitrarily-shaped video object presents still further challenges.
There are various simpler alternatives to full search block matching in the literature. Most of them use a coarse-to-fine search strategy, e.g., hierarchical motion estimation, a three-step search, a logarithmic search, and so forth. These fast search algorithms subsample the reference picture into various scales and perform a full search starting from the coarsest scale. The subsequent searches, which occur at finer scales, are limited to the pixels surrounding the previous motion vector. The same process is repeated until the final result at the actual scale is obtained. However, these modifications are sub-optimum since they may choose only a locally optimal solution, and they generally use a full search method as their benchmark.
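For comparison, the sketch below shows the classic three-step search, one of the fast alternatives named above. It is the textbook variant, not a method defined by this patent: it does not subsample the picture, but evaluates nine candidates around the current best offset and halves the step size (4, 2, 1) after each pass. The frame layout, the bounds handling, and the function names are assumptions for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def three_step_search(current, reference, top, left, m, start_step=4):
    """Coarse-to-fine search; returns ((mv_x, mv_y), error)."""
    cur = [row[left:left + m] for row in current[top:top + m]]
    height, width = len(reference), len(reference[0])
    center = (0, 0)
    best_err = sad(cur, [row[left:left + m] for row in reference[top:top + m]])
    step = start_step
    while step >= 1:
        best = center
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cx, cy = center[0] + dx, center[1] + dy
                y, x = top + cy, left + cx
                if y < 0 or x < 0 or y + m > height or x + m > width:
                    continue  # skip candidates outside the picture
                err = sad(cur, [row[x:x + m] for row in reference[y:y + m]])
                if err < best_err:
                    best_err, best = err, (cx, cy)
        center = best  # refine around the best offset found at this step
        step //= 2
    return center, best_err
```

With a start step of 4 this covers offsets up to ±7 using at most 25 block comparisons instead of 225, but because each pass only follows the best coarse candidate it can settle on a local minimum, which is the sub-optimality noted above.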
Accordingly, it would be desirable to provide an improved, more efficient shape and texture motion estimation system for digital video objects. The system should exploit the irregular boundary of the object to reduce the number of searches. The system should also be general enough to be used with any fast block matching alternative. The system should be applicable to arbitrarily-shaped video coding algorithms, such as MPEG-4.
The system should provide a shaped search area that follows a shape of the video object being coded.
The system should be useable in an MPEG-4 encoder or other object-based encoder.
The present invention provides a system having the above and other advantages.
SUMMARY OF THE INVENTION
The invention relates to an efficient motion estimation technique for an arbitrarily-shaped video object that reduces the number of searches for motion estimation for shape coding and texture coding. The invention is particularly suitable for use in an MPEG-4 encoder for coding Video Object Planes (VOPs).
Essentially, the invention provides a technique for shaping the search area for motion estimation according to the shape of the video object being coded.
A method for motion estimation coding of an arbitrarily-shaped video object includes the step of determining whether successive blocks of pixels of at least a portion of the reference video image are outside the video object, overlap the video object, or are inside the video object. Each block, such as an m×m block, has a respective reference pixel and a plurality of associated neighboring pixels.
Respective mask values corresponding to positions of the respective reference pixels in the reference video image are provided according to whether the associated blocks are outside the video object, overlap the video object, or are inside the video object. The respective mask values indicate a search region in the reference video image for motion-estimation coding of the video object that corresponds to a shape of the video object.
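The block classification described above might be sketched as follows. Several choices here are assumptions made only for illustration, since the text available at this point does not fix them: the mask values 0, 1 and 2, the use of the top-left pixel as each block's reference pixel, a binary alpha plane as the object shape, and the function names.

```python
# Hypothetical mask values; the actual values are not given in the text above.
OUTSIDE, OVERLAP, INSIDE = 0, 1, 2


def classify_block(alpha, top, left, m):
    """Classify the m x m block whose reference (top-left) pixel is (top, left)."""
    samples = [alpha[r][c]
               for r in range(top, top + m)
               for c in range(left, left + m)]
    if not any(samples):
        return OUTSIDE   # no pixel of the block belongs to the object
    if all(samples):
        return INSIDE    # every pixel of the block belongs to the object
    return OVERLAP       # the block straddles the object boundary


def build_search_mask(alpha, m):
    """Return one mask value per reference-pixel position where a block fits."""
    height, width = len(alpha), len(alpha[0])
    return [[classify_block(alpha, top, left, m)
             for left in range(width - m + 1)]
            for top in range(height - m + 1)]
```

One plausible use, again an assumption, is for a block matching loop to skip every candidate position whose mask value is `OUTSIDE`, so that the search region follows the shape of the object.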
The successive blocks are outside th
Chen Xuemin
Panusopone Krit
General Instrument Corporation
Le Vu
Lipsitz Barry R.
Efficient motion estimation for an arbitrarily-shaped object