Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal
Reexamination Certificate
2000-05-30
2001-10-23
Le, Vu (Department: 2613)
C348S699000
active
06307887
ABSTRACT:
FIELD OF THE INVENTION
The invention relates generally to the encoding and decoding of video signals, and more particularly to an improved method and system for encoding and decoding video signals using bilinear motion compensation and lapped orthogonal transforms in a layered video compression system.
BACKGROUND OF THE INVENTION
Because of the massive amounts of data inherent in digital video, the transmission of full-motion, high-definition digital video signals is a significant problem in the development of high-definition television. More particularly, each digital image frame is a still image formed from an array of pixels according to the display resolution of a particular system. As a result, the amounts of raw digital information included in high-resolution video sequences are massive. The present technology is incapable of storing and transmitting these amounts of raw video information at a cost that would make digital television practical. Moreover, bandwidth limitations prohibit sending digital video information in this manner. Consequently, various video compression standards or processes have been established, including MPEG-1, MPEG-2, and H.26X.
In general, video compression techniques utilize similarities between successive image frames to provide interframe compression, in which pixel-based representations of image frames are converted to motion representations. In addition, the conventional video compression techniques utilize similarities within image frames, referred to as spatial or intraframe correlation, to provide intraframe compression in which the motion representations within an image frame are further compressed. Intraframe compression is based upon conventional processes for compressing still images, such as discrete cosine transform (DCT) encoding.
MPEG-2, the most prevalent compression technology, provides interframe compression and intraframe compression based upon square blocks or arrays of pixels in video images. A video image is divided into a plurality of transformation blocks, each block composed of 16×16 pixels. For each transformation block T_N in an image frame N, a search is performed across the image of an immediately preceding image frame N−1 or also a next successive video frame N+1 (i.e., bidirectionally) to identify the most similar respective transformation blocks T_{N+1} or T_{N−1}.
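The block search described above can be illustrated with a minimal sketch. It assumes frames are lists of rows of integer pixel values, measures similarity with the sum of absolute differences (SAD), and limits the search to a small window around the block. The SAD metric, the window, and the names `sad` and `best_match` are illustrative assumptions, not details taken from MPEG-2 or from this patent.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_match(cur, ref, bx, by, size=16, search=4):
    """Exhaustively test every displacement within +/- `search` pixels and
    return ((dx, dy), cost) for the most similar block in `ref`."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidate blocks that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best
```

Calling `best_match(cur, ref, 8, 8)` returns the displacement of the closest block found in `ref` together with its SAD cost. A cost of zero corresponds to the ideal case in which the two blocks are pixel-for-pixel identical.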
Ideally, and with reference to a search of the next successive image frame, the pixels in transformation blocks T_N and T_{N+1} are identical, even if the transformation blocks have different positions in their respective image frames. Under these circumstances, the pixel information in transformation block T_{N+1} is redundant to that in transformation block T_N. Compression is achieved by substituting the positional translation between transformation blocks T_N and T_{N+1} for the pixel information in transformation block T_{N+1}. A single translational vector (ΔX, ΔY) is designated for the video information associated with the 256 pixels in transformation block T_{N+1}.
In reality, however, the video information (i.e., pixels) in the corresponding transformation blocks T_N and T_{N+1} is rarely identical. The difference between them is designated a transformation block error E, which often is significant. Although it is compressed by a conventional compression process such as discrete cosine transform (DCT) encoding, the transformation block error E is cumbersome and limits the extent (ratio) and the accuracy by which video signals can be compressed.
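As a rough illustration of the transformation block error E, the sketch below subtracts the translated block of frame N from the actual block of frame N+1, then shows that adding E back to the prediction recovers the block. The frame layout (lists of rows of integers) and the names `block_error` and `reconstruct` are assumptions made for this example. The DCT step that would normally compress E is omitted.

```python
def block_error(frame_n1, frame_n, bx, by, dx, dy, size=16):
    """Per-pixel error E: the block at (bx, by) in frame N+1 minus the block
    displaced by (dx, dy) in frame N (the translational prediction)."""
    return [
        [frame_n1[by + y][bx + x] - frame_n[by + dy + y][bx + dx + x]
         for x in range(size)]
        for y in range(size)
    ]


def reconstruct(frame_n, error, bx, by, dx, dy):
    """Rebuild the frame N+1 block from the prediction plus the error E."""
    size = len(error)
    return [
        [frame_n[by + dy + y][bx + dx + x] + error[y][x] for x in range(size)]
        for y in range(size)
    ]
```

When the motion is purely translational and the vector is exact, every entry of E is zero. Any rotation, scaling, or change in content shows up as nonzero entries that must be coded and transmitted.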
Large transformation block errors E arise in block-based video compression methods for several reasons. The block-based motion estimation represents only translational motion between successive image frames. The only changes between corresponding transformation blocks T_N and T_{N+1} that can be represented are changes in the relative positions of the transformation blocks. A disadvantage of such representations is that full-motion video sequences frequently include complex motions other than translation, such as rotation, magnification and shear. Representing such complex motions with simple translational approximations produces these significant errors.
Another aspect of video displays is that they typically include multiple image features or objects that change or move relative to each other. Objects may be distinct characters, articles, or scenery within a video display. With respect to a scene in a motion picture, for example, each of the characters (i.e., actors) and articles (i.e., props) in the scene could be a different object.
The relative motion between objects in a video sequence is another source of significant transformation block errors E in conventional video compression processes. Due to the regular configuration and size of the transformation blocks, many of them encompass portions of different objects. Relative motion between the objects during successive image frames can result in extremely low correlation (i.e., high transformation errors E) between corresponding transformation blocks. Similarly, the appearance of portions of objects in successive image frames (e.g., when a character turns) also introduces high transformation errors E.
Conventional video compression methods appear to be inherently limited due to the size of transformation errors E. With the increased demand for digital video display capabilities, improved digital video compression processes are required.
At the same time, it has been shown that by splitting a video signal into a base stream and an enhancement stream, the amount of video data transmitted in a given time can be significantly increased. This technique, known as layered compression, provides a substantial improvement over conventional MPEG-2 transmission. However, even with this technique, MPEG-2 still suffers from the same basic error problems, i.e., while it functions well for translational motion, more complex motion produces errors which break down the translational model. When the translational model breaks down, which it often does, a significant amount of information must be sent to correct the predictions. If the channel does not possess sufficient room for this information, then the predictions reconstructed at the receiver will be poor, cascading into ever-poorer predictions of subsequent frames. To reset the prediction process, DCT-compressed (but still large and lossy) Intra frames, or I-frames, are sent periodically, but to save on overall data transmission they are only sent nominally every half-second. In short, with MPEG-2, the overall picture quality still suffers when complex motion is involved, and is at times unpleasant due to the sharp contrast between block edges.
OBJECTS AND SUMMARY OF THE INVENTION
Accordingly, it is a general object of the present invention to provide a method and system for improving the transmission and reconstruction of compressed video images.
A related object is to provide the method and system in a layered compression architecture.
In accomplishing those objects, it is a related object to provide a method and system as characterized above that substantially reduces the perceived edge boundaries between blocks of pixels in a reconstructed image.
Yet another object is to provide a method and system of the above kind that operates without increasing the amount of motion vector data transmitted while oftentimes reducing the amount of error correction data transmitted.
Briefly, the present invention provides an improved method and system for altering existing image data, such as from one video frame to the next, by employing a bilinear motion compensation system. The blocks of pixel information from a previous frame of blocks are preserved, and a plurality of vectors, preferably four (three of which are from proximate blocks), are received and associated with each block. As each block is compensated for motion differences, the four vectors are used in a bilinear interpolation operation on each pixel in the block to determine an adjusted address for ea
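The per-pixel bilinear interpolation of four vectors mentioned in the summary can be sketched as follows. This is only an illustration under stated assumptions: the four vectors are taken to sit at the block corners (top-left, top-right, bottom-left, bottom-right), the weights are the standard bilinear weights, and the names `interpolate_vector` and `adjusted_address` are hypothetical. The source text is cut off before it specifies these details.

```python
def interpolate_vector(v00, v10, v01, v11, x, y, size=16):
    """Bilinearly blend four (dx, dy) vectors for the pixel at offset (x, y)
    inside a size x size block.

    Assumed layout: v00 top-left, v10 top-right, v01 bottom-left,
    v11 bottom-right.
    """
    fx = x / size  # horizontal weight, 0 at the left edge
    fy = y / size  # vertical weight, 0 at the top edge
    w00 = (1 - fx) * (1 - fy)
    w10 = fx * (1 - fy)
    w01 = (1 - fx) * fy
    w11 = fx * fy
    dx = w00 * v00[0] + w10 * v10[0] + w01 * v01[0] + w11 * v11[0]
    dy = w00 * v00[1] + w10 * v10[1] + w01 * v01[1] + w11 * v11[1]
    return dx, dy


def adjusted_address(bx, by, x, y, vectors, size=16):
    """Address in the previous frame for pixel (x, y) of the block whose
    top-left corner is (bx, by), after adding the interpolated vector."""
    dx, dy = interpolate_vector(*vectors, x, y, size)
    return bx + x + dx, by + y + dy
```

If all four vectors are equal, every pixel receives the same displacement and the scheme reduces to ordinary translational compensation. When the vectors differ, the displacement varies smoothly across the block, which is what lets this kind of interpolation approximate non-translational motion.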
Le Vu
Michalik & Wylie PLLC
Microsoft Corporation