Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal
Reexamination Certificate
2000-03-23
2003-05-20
Le, Vu (Department: 2613)
C348S699000
active
06567469
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to digital video compression, and more particularly, to a hardware-efficient, high-performance motion estimation algorithm that has particular utility in H.261 digital video encoders.
Many different video compression algorithms have been developed for digitally encoding (“compressing”) video data in order to minimize the bandwidth required to transmit the digitally-encoded video data (“digital video data”) for a given picture quality. Several multimedia specification committees have established and proposed standards for encoding/compressing audio and video data. The most widely known and accepted international standards have been proposed by the Moving Picture Experts Group (MPEG), including the MPEG-1 and MPEG-2 standards. Officially, the MPEG-1 standard is specified in the ISO/IEC 11172-2 standard specification document, which is herein incorporated by reference, and the MPEG-2 standard is specified in the ISO/IEC 13818-2 standard specification document, which is also herein incorporated by reference. These MPEG standards for moving picture compression are used in a variety of current video playback products, including digital versatile (or video) disk (DVD) players, multimedia PCs having DVD playback capability, and satellite broadcast digital video.
Although the MPEG standards typically provide high picture quality, the data rate/bandwidth requirements are far too great for some applications. Videoconferencing is a particular application that typically does not require the coding resolution afforded by MPEG because the picture content does not normally vary a great deal from picture-to-picture, e.g., most of the motion is confined to a diamond-shaped region in the picture where the head and shoulders of the conferee are located. In short, because there is so little motion in a sequence of moving pictures in a videoconferencing application, there is a great deal of redundancy from picture-to-picture, and consequently, the degree of video data compression which is possible for a given picture quality is much greater. Moreover, the available bandwidth for many videoconferencing systems is less than 2 Mbits/second, which is far too low for MPEG transmissions.
Accordingly, a collaboration of telecommunications operators and manufacturers of videoconferencing equipment developed the H.320 videoconferencing standards for videoconferencing over circuit-switched media like ISDN (Integrated Services Digital Network) and switched-56 connections. H.261 is the video coding component of this standard. It is also known as the P×64 standard since it describes video coding and decoding rates of p×64 kbits/second, where p is an integer from 1 to 30. Thus, the H.261 video coding algorithm compresses video data at data rates ranging from 64 kbits/second to 1,920 kbits/second. The H.320 standard was ratified in Geneva in December of 1990. This standard is herein incorporated by reference.
Like MPEG, the H.261 encoding algorithm uses a combination of DCT (Discrete Cosine Transform) coding and differential coding. However, only I-pictures and P-pictures are used. An I-picture is coded using only the information contained in that picture, and hence, is referred to as an “Intra-coded” or “Intra” picture. A P-picture is coded using motion compensated prediction (or “motion estimation”) based upon information from a past reference (or “anchor”) picture, and hence, is referred to as a “Predictive” or “Predicted” picture. In accordance with the H.261 standard, the compressed digital video data stream is arranged hierarchically in four layers: picture, group of blocks (GOB), macroblock (MB), and block. A picture is the top layer. Each picture is divided into groups of blocks (GOBs). A GOB is either one-twelfth of a CIF (Common Intermediate Format) picture or one-third of a QCIF (Quarter CIF) picture. Each GOB is divided into 33 macroblocks. Each macroblock consists of a 16×16 pixel array.
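The layer arithmetic described above can be checked with a short sketch. The macroblock size (16×16) and the 33 macroblocks per GOB come from the text; the luminance dimensions used for CIF (352×288) and QCIF (176×144) are assumptions not stated in the text above, and the function name `layer_counts` is purely illustrative.

```python
MB_SIZE = 16       # macroblock is a 16x16 pixel array (from the text)
MBS_PER_GOB = 33   # macroblocks per group of blocks (from the text)

# Assumed luminance dimensions in pixels; not given in the text above.
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def layer_counts(width, height):
    """Count macroblocks and GOBs in a picture of the given luminance size."""
    macroblocks = (width // MB_SIZE) * (height // MB_SIZE)
    gobs, remainder = divmod(macroblocks, MBS_PER_GOB)
    if remainder:
        raise ValueError("picture does not divide into whole GOBs")
    return {"macroblocks": macroblocks, "gobs": gobs}


if __name__ == "__main__":
    for name, (width, height) in FORMATS.items():
        print(name, layer_counts(width, height))
```

Under these assumed dimensions, a CIF picture divides into twelve GOBs, which agrees with the one-twelfth figure given above.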
In short, just like MPEG, H.261 uses motion estimation to code those parts of sequential pictures that vary due to motion, where possible. More particularly, H.261 uses “motion vectors” (MVs) that specify the location of a “macroblock” within the current picture relative to its original location within the anchor picture, based upon a comparison between the pixels of the current macroblock and a corresponding array of pixels in the anchor picture within a given N×N-pixel search range. In accordance with the H.261 standard, the minimum search range is +/−7 pixels, and the maximum search range is +/−15 pixels. It will be appreciated that using the maximum search range in all H.261 applications will not necessarily improve the quality of the compressed signal. In this regard, since H.261 applications can operate at various bit rates, ranging from 64 kbits/second to 1,920 kbits/second, the actual search range employed may vary. For example, at high bit rates, the temporal distance between adjacent pictures is smaller, and thus, a smaller search range can be used to achieve a given picture quality. At low bit rates, the situation is reversed, and a larger search range is required in order to achieve a given picture quality.
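To give a sense of how the choice of search range affects compute effort, the following sketch counts the candidate positions and pixel comparisons per macroblock. It assumes an exhaustive search that visits every integer displacement within the range, which is only one possible strategy and is not prescribed by the text; the function names are hypothetical.

```python
def candidate_positions(search_range):
    """Number of integer displacements within +/- search_range on both axes."""
    side = 2 * search_range + 1  # includes the zero displacement
    return side * side


def pixel_comparisons(search_range, mb_size=16):
    """Pixel comparisons per macroblock, assuming every candidate is fully evaluated."""
    return candidate_positions(search_range) * mb_size * mb_size


if __name__ == "__main__":
    for search_range in (7, 15):
        print(search_range,
              candidate_positions(search_range),
              pixel_comparisons(search_range))
```

Under that assumption, moving from the minimum range of +/−7 to the maximum of +/−15 raises the candidate count from 225 to 961 per macroblock, which is why the range is worth matching to the bit rate.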
Once the motion vector for a particular macroblock has been determined, the pixel values of the closest-matching macroblock in the anchor picture identified by the motion vector are subtracted from the corresponding pixels of the current macroblock, and the resulting differential values are then transformed using a Discrete Cosine Transform (DCT) algorithm. The resulting coefficients are each quantized and Huffman-encoded, as are the motion vector and other information pertaining to and identifying that macroblock. If during the motion estimation process no adequate macroblock match is detected in the anchor picture (i.e., the differential value exceeds a predetermined threshold metric), or if the current picture is an I-picture, the macroblock is designated an “Intra” macroblock and the macroblock is coded accordingly.
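The subtraction step and the Intra fallback described above can be sketched as follows. This is a minimal illustration, not the encoder's actual decision rule: the use of the summed absolute residual as the "differential value", the `threshold` parameter, and the function names are all assumptions, and the DCT, quantization, and Huffman stages are omitted.

```python
def residual_block(current, predicted):
    """Subtract the matched anchor block from the current block, pixel by pixel."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


def choose_mode(current, best_match, threshold):
    """Return ("inter", residual), or ("intra", None) when the match is too poor.

    Assumption: match quality is the sum of absolute residual values,
    compared against a caller-supplied threshold.
    """
    residual = residual_block(current, best_match)
    error = sum(abs(value) for row in residual for value in row)
    if error > threshold:
        return "intra", None
    return "inter", residual


if __name__ == "__main__":
    current = [[10, 12], [14, 16]]      # tiny 2x2 stand-in for a macroblock
    best_match = [[9, 12], [15, 16]]
    print(choose_mode(current, best_match, threshold=5))
    print(choose_mode(current, best_match, threshold=1))
```

In a real encoder the residual returned for the "inter" case would be passed on to the transform stage, while the "intra" case would code the macroblock's own pixel values.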
The H.261 standard does not specify any particular implementation of the motion estimation algorithm employed. Otherwise stated, the H.261 standard leaves open the details of implementation of the motion estimation algorithm to the manufacturers of the videoconferencing systems. In general, various measures or metrics have been utilized and proposed to compute the location of the pixel array within the anchor picture that constitutes the closest match (i.e., minimum difference/error) relative to the current macroblock, and various motion estimation algorithms have been utilized and proposed to search for and locate the closest-matching macroblock in the anchor picture. These motion estimation (M.E.) algorithms are typically performed by software running on a processor, e.g., a TriMedia processor manufactured and sold by Philips Semiconductors, that is tasked with the encoding of the video data in the videoconferencing system. The overarching goal is to locate the closest-matching macroblock in the anchor picture as quickly as possible, while minimizing the load on the processor to execute the algorithm, and maintaining an acceptable level of error/inaccuracy. The hardware/software that actually executes the motion estimation search algorithm is sometimes termed the “search engine”. In terms of the search engine, the overarching goal is to optimize its performance while minimizing the resources required to execute the motion estimation algorithm. Simply stated, the basic goal is to minimize compute effort and compute time.
Among the best-known criteria or metrics for evaluating the quality of a match are the Sum of the Absolute Differences (SAD) and the Sum of the Squared Differences (SSD). The SAD metric constitutes the sum of the absolute values of the differences between each of the N pixels in the current macroblock (N=256 for the case of a 16×16 macroblock) and the corresponding pixel of the comparison macroblock in the anchor picture under evaluation. The SSD metric constitutes the sum of the squares of those same differences.
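As an illustration of these two metrics, the sketch below implements SAD and SSD on blocks stored as lists of rows and uses them in a plain exhaustive search over the search range. The exhaustive search is a baseline chosen for clarity, not the algorithm of the present invention. The helper names, the (dy, dx) displacement convention (offset from the current block position to the matching position in the anchor picture), and the tie-breaking rule (first minimum in raster order) are assumptions.

```python
def sad(block_a, block_b):
    """Sum of absolute pixel differences between two equal-sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def ssd(block_a, block_b):
    """Sum of squared pixel differences between two equal-sized blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def extract_block(picture, top, left, size):
    """Copy a size x size block whose top-left corner is at (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def full_search(current, anchor, top, left, size=16, search_range=7, metric=sad):
    """Exhaustively search the anchor picture; return (dy, dx, cost) of the best match.

    Candidates that would extend outside the anchor picture are skipped.
    """
    height, width = len(anchor), len(anchor[0])
    target = extract_block(current, top, left, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = metric(target, extract_block(anchor, y, x, size))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


if __name__ == "__main__":
    # Synthetic 8x8 ramp; the current picture is the anchor shifted by (1, 2).
    anchor = [[y * 8 + x for x in range(8)] for y in range(8)]
    current = [[anchor[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
                for x in range(8)] for y in range(8)]
    print(full_search(current, anchor, top=2, left=2, size=4, search_range=2))
```

Passing `metric=ssd` swaps in the squared-difference criterion without changing the search loop; SSD penalizes large individual pixel errors more heavily than SAD but requires a multiplication per pixel.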
Koninklijke Philips Electronics , N.V.
Le Vu
Vodopia John