Image analysis – Image compression or coding
Reexamination Certificate
1999-08-11
2004-08-24
Do, Anh Hong (Department: 2721)
C382S260000, C382S261000, C348S716000
active
06782132
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to digital video and, more particularly, to digital video coding and reconstruction.
2. Discussion of Prior Art
The recent introduction of digital video technology holds great promise for the future of multimedia. Unlike its analog predecessors, digital video is capable of being stored, transferred, manipulated, displayed and otherwise processed with greater precision by a wide variety of digital devices. Digital processing can also be more readily conducted in conjunction with various other digital media (e.g. graphics, audio, animation, virtual-reality, text, mixed media, etc.), and with more reliable synchronization and lower generational degradation.
Successful deployment of digital video is largely due to the wide adoption of digital video standards, such as those espoused by the Moving Picture Experts Group (“MPEG specifications”). While often hindered by proliferated compatibility with analog conventions (e.g. interlace video) and other factors, standardized digital constructs nevertheless provide substantial compression via common video signals and produce conventionally “acceptable” perceived image quality.
FIG. 1, for example, illustrates a typical standard-compliant, one-to-many encoder and deterministic-decoder pair or “codec.” As shown, codec 100 includes encoder 101 and decoder 103, which are connected via communications system 102. Operationally, pre-processor 111 typically receives, downscales and noise filters video source s to remove video signal components that might otherwise impede encoding. Next, encode-subsystem 113 compresses and codes pre-processed signal s′, producing encoded-signal b. Multiplexer 115 then modulates and multiplexes encoded-signal b and transfers resultant signal b′ to communications subsystem 102. Communications subsystem 102 (typically not part of codec 100) can be a mere data transfer medium or can also include system interfaces and/or subsystems for combining, scheduling and/or delivering multiple singular and/or mixed media signals to a receiver. Decoder 103 typically operates at a receiver to reconstruct video source s. More specifically, signal b′ is demodulated and demultiplexed by demultiplexer 131, decoded by decode-subsystem 133 and then post-processed (e.g. filtered, converted, etc.) by post-processor 135. Following decoding, decoded-signal r′, which resembles the source signal s, is displayed and/or stored.
FIGS. 2 and 3 respectively illustrate encode-subsystem 113 and decode-subsystem 133 of FIG. 1 in greater detail. Beginning with FIG. 2, the downscaled video signal s′ from pre-processor 111 (FIG. 1) is received, optionally formatted, and then stored in frame store 203 by capture unit 201. Captured signals c′ are represented as a sequence of two-dimensional sample lattices corresponding to video frames. (The number of captured frames contemporaneously stored by frame store 203 is determined by encode-subsystem latency and the analysis window size utilized by analysis unit 202.) Stored frames are transferred to analysis unit 202 and otherwise retrieved multiple times as needed for actual encoding. Analysis unit 202, for example, measures standard-specific properties of each stored frame, which it transfers as metrics to decision unit 204.
Next, the analysis unit metrics are inserted into an encoding formula, producing the coding modes according to which encode-subsystem 205 represents pre-processed frames as standard-compliant encoded-frames. More specifically, temporal prediction unit 207 retrieves frames from frame store 208, uses captured-frames to form a coarse current-frame prediction and then refines this prediction according to prior-encoded frames. Decision unit 204 then uses the refined predictions and metrics to control current frame coding. Finally, encode unit 205 uses a current coding mode to form, on a frame-area (“macroblock”) basis, a coded frame.
Continuing with FIG. 3, a typical decode-subsystem 133 performs a simpler, deterministic operation than encode-subsystem 113, using the frame-data of each encoded frame to determine the proper reconstruction of a corresponding decoded frame. (For clarity, elements complementary to those of the encode-subsystem of FIG. 2 are correspondingly numbered.) Operationally, parsing engine 301 de-multiplexes the received variable length encoded-bitstream b. Thereafter, decode unit 305 provides spatial frame elements and temporal prediction unit 307 provides temporal frame elements, which reconstruction unit 306 reconstructs into decoded frames. Frame store 303 provides for frame reordering of differentially-coded adjacent frames (discussed below) and can also serve as a frame-buffer for post-processor 135 (FIG. 1).
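The reconstruction step described for FIG. 3 can be sketched roughly as follows. This is only an illustrative reading, not the patent's implementation: it assumes the "temporal frame elements" are a motion-compensated prediction taken from a previously decoded frame and the "spatial frame elements" are a decoded residual, and it uses a one-dimensional list of samples in place of a two-dimensional macroblock. The function names `motion_compensate` and `reconstruct_block` are hypothetical.

```python
# Hypothetical sketch: names and the tiny 1-D "frame" are illustrative only.

def motion_compensate(reference, start, motion_vector, size):
    """Fetch the block that the motion vector points to in the reference frame."""
    src = start + motion_vector
    return reference[src:src + size]

def reconstruct_block(prediction, residual):
    """Add the decoded residual to the prediction, clamping to the 8-bit range."""
    return [max(0, min(255, p + r)) for p, r in zip(prediction, residual)]

reference_frame = [10, 20, 30, 40, 50, 60, 70, 80]  # previously decoded frame
residual = [1, -2, 0, 3]                            # assumed decoded residual data

# The block starting at sample 4 is predicted from the samples 2 positions earlier.
prediction = motion_compensate(reference_frame, start=4, motion_vector=-2, size=4)
decoded_block = reconstruct_block(prediction, residual)
print(decoded_block)
```

Because the decoder only follows the data it is given, the same inputs always yield the same block, which is the sense in which the decode operation is deterministic.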
In addition to current-frame prediction (above), standard-compliant codecs also provide for compression through differential frame representation and prediction error data. MPEG-2 coded video, for example, utilizes intra (“I”), predictive (“P”) and bi-directional (“B”) frames that are organized as groups-of-pictures (“GOPs”), and which GOPs are organized as “sequences.” Typically, each GOP begins with an I-frame and then two B-frames are inserted between the I-frame and subsequent P-frames, resulting in a temporal frame sequence of the form IBBPBBPBB, and so on. I-frames represent a complete image, while P and B frames can be coded respectively as differences between preceding and bi-directionally adjacent frames (or on a macroblock basis). More specifically, P and B frames include motion vectors describing interframe macroblock movement. They also include prediction data, which describes remaining (poorly motion-estimated or background) macroblock spatial-pattern differences, and prediction error data, which attempts to fill in for or “spackle” data lost to prediction inaccuracies. Prediction and prediction error data are also further compressed using a discrete cosine transform (“DCT”), quantization and other now well-known techniques.
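The frame pattern above, and the frame reordering it implies for the decoder's frame store, can be illustrated with a short sketch. The helper names `gop_display_order` and `coding_order` are hypothetical, and the reordering rule shown (each I or P anchor is placed ahead of the B-frames that reference it) is a simplified illustration rather than actual bitstream handling.

```python
def gop_display_order(num_frames, b_between=2):
    """Frame types in display order: an I-frame, then a P-frame after every run of B-frames."""
    types = []
    for i in range(num_frames):
        if i == 0:
            types.append("I")
        elif i % (b_between + 1) == 0:
            types.append("P")
        else:
            types.append("B")
    return types

def coding_order(types):
    """Reorder frame indices so each anchor (I or P) precedes the B-frames that depend on it."""
    order, pending_b = [], []
    for index, frame_type in enumerate(types):
        if frame_type == "B":
            pending_b.append(index)   # a B-frame must wait for its following anchor
        else:
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B-frames would really wait for the next GOP's anchor
    return order

display = gop_display_order(9)
print("".join(display))
print(coding_order(display))
```

With nine frames this prints the pattern IBBPBBPBB and the index order [0, 3, 1, 2, 6, 4, 5, 7, 8]. A decoder receiving frames in that order has to hold each anchor until the intervening B-frames have been displayed.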
Among other features, MPEG and other standards were intended to meet emerging coding needs. For example, they specify protocols rather than device configurations to enable emerging, more efficient protocol-compliant devices to be more readily utilized. (One purpose of GOPs, for example, is to avoid proliferation of drift due to differing decoder implementations by assuring periodic I-frame “refreshes.”) MPEG-2 further provides profiles and levels, which support emerging higher resolution video (e.g. HDVD, HDTV, etc.). Scalability modes are also provided. Much like adding missing prediction error data to prediction data, MPEG-2 scalability modes allow “enhancement” frame data to be extracted from “base” frame data during encoding (typically using a further encode-subsystem) and then optionally re-combined from the resulting base and enhancement “layers” during decoding.
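The base and enhancement layer idea can be sketched on raw sample values. This is a loose, SNR-style illustration under stated assumptions, not MPEG-2 syntax: the base layer is a coarse quantization of the samples, the enhancement layer is a finer quantization of what the base layer missed, and the step sizes (16 and 4) and function names are hypothetical.

```python
def quantize(values, step):
    """Map each value to the index of the nearest multiple of step."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Map quantization indices back to sample values."""
    return [i * step for i in indices]

def encode_layers(samples, base_step=16, enh_step=4):
    """Split samples into a coarse base layer and a finer enhancement layer."""
    base = quantize(samples, base_step)
    base_recon = dequantize(base, base_step)
    residual = [s - b for s, b in zip(samples, base_recon)]  # what the base layer lost
    enhancement = quantize(residual, enh_step)
    return base, enhancement

def decode_layers(base, enhancement=None, base_step=16, enh_step=4):
    """Reconstruct from the base layer alone, or refine it with the enhancement layer."""
    recon = dequantize(base, base_step)
    if enhancement is not None:
        refinement = dequantize(enhancement, enh_step)
        recon = [r + e for r, e in zip(recon, refinement)]
    return recon

samples = [100, 37, 201, 59]
base, enhancement = encode_layers(samples)
base_only = decode_layers(base)
full = decode_layers(base, enhancement)
print(base_only)
print(full)
```

For these samples the base-only reconstruction is [96, 32, 208, 64] and the combined reconstruction is [100, 36, 200, 60]. A decoder that ignores the enhancement layer still produces a usable but coarser result.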
Unfortunately, standards are ultimately created in hindsight by committee members who cannot possibly foresee all contingencies. Worse yet, new standards materialize slowly due to the above factors and a need to remain compatible with legacy devices operating in accordance with the existing standard.
For example, while current standard-compliant codecs produce generally acceptable quality when used with conventional standard-definition television (“SDTV”), resultant signal degradation is perceivable and will become even more so as newer, higher-definition devices emerge. Block-based coding, for example, is non-ideal for depicting many image types, particularly images that contain objects exhibiting high velocity motion, rotation and/or deformation. In addition, standard compression is prone to over-quantization of image data in meeting bitrate and other requirements. Further, even assuming that an ideal low-complexity image well suited to block-based coding is supplied, image quality is nevertheless conventionally limited to that of the pre-processed signal. Defects
Do Anh Hong
Pixonics, Inc.
Video coding and reconstruction apparatus and methods