Apparatus and method for fast motion estimation

Image analysis – Applications – Motion or velocity measuring

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C382S236000, C348S154000, C348S155000

Reexamination Certificate

active

06269174

ABSTRACT:

BACKGROUND OF THE INVENTION
Motion estimation is an essential part of modern digital video encoding systems. Such standards as ISO MPEG-1 (formally known as ISO/IEC 11172), ISO MPEG-2 (formally known as ISO/IEC 13818), ITU-T H.261, ITU-T H.263, ISO MPEG-4 and many other video coding technologies utilize motion compensation as a mean for reducing data redundancy by transmitting components of motion vector (MV) for a set of rectangular pixel blocks (macroblocks in MPEG-1 and MPEG-2 terminology).
The process of searching the best suitable motion vectors for all macroblocks is called motion estimation (ME). It is generally known to be the most computationally complex element of video encoding systems. It is often the most expensive and power-consuming part of digital video encoding systems and the most processor intensive part of software video encoding and video conferencing applications. Similar procedure to ME is used in some other applications that involve searching for data correlation in images or 2-dimensional arrays, for example: image recognition, robotic vision, still image data compression and so on.
When the motion estimation procedure is completed, the digital video encoder stores the motion vector components for every pixel block of the video frame in the output data stream so that the video decoder will be able to apply motion compensation when reconstructing the video frame sequence.
The ME algorithm of is not a normative part of video encoding standard specifications, so the developer has some freedom in choosing a ME method. The purpose is to find a particular macroblock on the current video frame that is most similar to the macroblock on the previous video frame. As is known, a macroblock which is often used to refer to the pel data, contains the luminance data a chroma format. For example, a macroblock may comprise the four 8 by 8 blocks of luminance data and the two (for 4:2:0 chroma format), four (for 4:2:2 chroma format) or eight (for 4:4:4 chroma format) corresponding 8 by 8 blocks of chrominance data coming from a 16 by 16 section of the luminance component of the picture.
Often the search area is limited by some fixed maximum value of motion vector components. In many video encoding methods (for example MPEG-1 and MPEG-2), from a set of equally good macroblocks and motion vectors, which represent said macroblock's position, the one with the minimum length motion vector is preferable because it requires fewer bits to encode the motion vector components. A motion vector search requires a criterion specifying the best outcome. Usually mean squared error (MSE) or mean absolute distortion (MAD) is used as a block matching criterion. Some other criteria for block matching have been proposed. For the subject of the present invention the choice of block matching criterion is not significant.
Full or exhaustive spiral search (calculating matching criteria for every mackroblock of the current frame carried out in a spiral order within a limited search area on the previous frame) is usually assumed to give the best possible quality of ME. Unfortunately, full spiral search algorithms require a large number of calculations and are not suitable for many applications. For example, full spiral search ME for a picture size of 360×240 pixels within search area of +/−64 pixels and with a macroblock size of 16×16 pixels requires on the order of 10
10
integer arithmetic operations.
Many different approaches have been proposed to reduce the computational cost of ME. The proposed “fast ME” algorithms can be categorized into several groups, described below. All of them are based on assumptions about a particular type of continuity in the dependency of matching criterion from motion vector components, or some temporal or spatial correlation between the best possible motion vector values. Such assumptions do not always hold, as we will describe later. Such assumptions underly all known fast ME methods and limit their usefulness.
Fast ME with Unimodal Error Surface Assumption
Many kinds of fast ME algorithms restrict the number of tested MV by assuming that matching error increases monotonically when the point of the ME search moves away from the best possible ME position. However, this assumption does not usually hold for real input video signals. A real input video signal always contains some level of noise in its pixel values and often contains an unpredictable amount of small and large visible objects in the frame. As a result the search is often trapped to a local minimum position far from the global minimum of a targeting matching criterion.
One example of such an algorithm is the two-dimensional logarithmic search that described by T.Koga, K.Iinuma, A.Hirano etc. in “Motion compensated interframe coding for video conferencing”, Proc.Nat.Telecommunication Conf., pp. G5.3.1-5.3.5, Nov, 29-Dec. 3 1981. Another example of a method that is related to unimodal error surface assumption described in U.S. Pat. No. 5,623,313 to Naveen, entitled “Fractional pixel motion estimation of video signal”.
Fast ME with Pixel Subsampling
One class of fast ME algorithms, sometimes referred to as Fast ME with Pixel Subsampling, is based on limiting the number of pixels used in the calculation of matching criterion. It is based on the assumption that if the error in some subset of pixels in the macroblock is minimized, the targeting matching criterion is minimized as well. This assumption usually does not hold for real input video signals. In addition, this algorithm provides for a relatively small reduction in the number of required calculations, usually from 4 to 16 times, and further reduction is desired.
Hierarchical and Multiresolution Methods of Fast ME
Another group of fast ME methods is based on doing a preliminary search of motion vectors on a coarse-resolution frame and refining the predicted motion vector in the frame with fine resolution. This is known as a hierarchical method and uses several levels of search with the same image size, but different block sizes at every level, as proposed in M.Bierling, “Displacement estimation by hierarchical block matching”, Proc. SPIE Visual Communications and Image Processing 1988, vol. 1001, pp 942-951. This approach is based on the assumption that the motion vector obtained for the large block size is a good starting point for the search for smaller block sizes. However, this assumption is not always true and may not give a good initial motion vector.
A multiresolution method (also known as a pyramidal method) uses several coarse levels of image created from original image by reducing image resolution. In this approach, the search is performed first on the coarsest-resolution image for the proportionally reduced size of macroblock. Then, the best resulting motion vector is interpolated to obtain an initial value of motion vector for the next level. The motion vector is then refined, by searching in a small area on the next level.
A variation of this method was described in K.M.Uz, M.Verittiand D.LeGall, “Interpolative multiresolution coding of advanced television with comparable subchannels”, IEEE Trans. Circuits Syst. Video Technol., vol.1, pp.86-99, March 1991., and in S.Zafar, Y.Q.Zhang, B.Jabbari,“Multiscale video representation using multiresolution motion compensation and wavelet decomposition”, IEEE J.Select.Areas Commun., vol.11, pp.24-35, January 1993. Another example of hierarchical ME is described in U.S. Pat. No. 5,635,994 to Drexler et al, entitled “Method of making a hierarchical estimate of image motion in a television signal.” This class of ME methods is based on the assumption that the best motion vector for a given coarse resolution image level and proportionally reduced size of macroblock is a good estimation for the next finer level. This may work for smooth, relatively monotonous images, but not for images with fine details that are invisible at coarse resolutions. This approach is also less effective for input with significant levels of noise in the signal.
Fast ME Based on Assumption of Spatial

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Apparatus and method for fast motion estimation does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Apparatus and method for fast motion estimation, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Apparatus and method for fast motion estimation will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2436525

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.