Apparatus and methods for adaptive digital video quantization
Patent number: 06782135 (Reexamination Certificate, active)
Filed: 2000-02-18
Issued: 2004-08-24
Examiner: Do, Anh Hong (Department: 2624)
Classification: Image analysis – Image compression or coding – Adaptive coding
Additional classes: C382S232000, C382S251000
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to digital video processing and, more particularly, to digital video compression.
2. Discussion of Prior Art
Data reduction occurs during various stages of digital video encoding. However, quantization—which provides one of the best data compression opportunities—is also perhaps the least well-understood.
A typical video encoder receives source data having an initial spatial resolution. Prior to actual coding, the source data is mapped to a typically lower resolution sampling grid (“down-sampled”), filtered and then analyzed for statistical coding metrics according to which coding is then conducted. During coding, an encode-subsystem compresses the pre-processed data, typically using conversion, quantization and other processing to modify successive pictures (e.g. frames, blocks, objects, etc.).
In MPEG-2, for example, block-based motion-compensated prediction enables the use of not only complete picture representations (i.e. intra or I-pictures), but also predicted (P and B) pictures represented by predicted intra-picture motion (“prediction data”) and predicted-versus-actual picture or “prediction error” data. The prediction error data is then converted using a discrete cosine transform or “DCT” and then quantized. During quantization, additional bitrate reduction is achieved by replacing higher resolution pictures with lower resolution (lower-bitrate) quantized pictures. Final coding and other processing also provide incremental data optimization.
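As a rough illustration of the transform-and-quantize step just described, the Python sketch below (illustrative only, not taken from the patent) applies an 8×8 DCT to a block of prediction-error samples and quantizes the coefficients with a single uniform step size; real MPEG-2 quantization additionally uses weighting matrices and a per-macroblock quantiser scale.

import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix (rows are basis vectors).
    m = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            m[k, i] = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m

def dct2(block):
    # Separable 2-D DCT of an 8x8 block: C @ block @ C^T.
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def quantize(coeffs, step):
    # Uniform quantization: a larger step discards more detail (lower bitrate).
    return np.round(coeffs / step).astype(int)

# Example: quantize a block of prediction-error samples with step size 16.
error_block = np.random.randint(-64, 64, size=(8, 8)).astype(float)
levels = quantize(dct2(error_block), step=16)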
While several factors can influence the bitrate that is devoted to each picture (e.g. using a corresponding quantization step size), a particularly promising one is perceptual masking. That is, the sensitivity of the human visual system (“HVS”) to distortion tends to vary in the presence of certain spatio-temporal picture attributes. It should therefore be possible to model the HVS perceptual masking characteristics in terms of spatio-temporal picture attributes. It should also be possible to determine appropriate quantization step-sizes for received pictures (e.g. in order to achieve a desired quality and/or bitrate) by analyzing the pictures, determining perceptually significant picture attributes and then applying the perceptual model.
The current understanding of perceptual masking is, however, limited, and the HVS is considered so complex and the perception of quality so subjective as to elude accurate modeling. See, for example, Digital Images and Human Vision, MIT Press (1993); MPEG Video Compression Standard, Chapman and Hall (1996); and Digital Video: An Introduction to MPEG-2, Chapman and Hall (1997). Nevertheless, attempts have been made to provide some degree of perceptual modeling in order to exploit HVS perceptual masking effects.
For example, many encoders now incorporate a quantizer that modifies or “adapts” a rate-control based nominal quantization step size according to a block energy measurement.
FIG. 1, for example, broadly illustrates a typical adaptive quantizer within an MPEG encoder. During quantization, rate controller 101 transfers to quantization-modifier 102 a nominal quantization value Q_Nom, macroblock data and a macroblock-type parameter. Quantization-modifier 102 processes the macroblock data, typically using sum of differences from DC (“SDDC”) or variance techniques, and then transfers to quantizer 103 a modified quantization value, M_Quant.

Within quantization-modifier 102, formatter 121 organizes each received macroblock into 4 blocks, each block containing an 8-row by 8-column array of pixel values, p(r,c), according to the received (frame-or-field) type parameters. Next, block energy analyzers 122a-d perform an SDDC (or variance based) block energy analysis for each of the blocks, as given by equations 1 or 2, respectively:
SDDC(block) = Σ_{r,c=0..7} |p(r,c) − mean-p(block)|   (Equation 1)

Variance(block) = Σ_{r,c=0..7} (p(r,c) − mean-p(block))²   (Equation 2)
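For concreteness, here is a minimal Python sketch of the two block-energy measures in Equations 1 and 2; the function names and the example block are illustrative, and “p” is taken to be an 8×8 NumPy array of pixel values.

import numpy as np

def sddc(block):
    # Equation 1: sum of absolute differences from the block's mean (DC) value.
    return np.abs(block - block.mean()).sum()

def variance(block):
    # Equation 2: sum of squared differences from the block's mean value
    # (the sum is not divided by the block size, matching the equation as written).
    return ((block - block.mean()) ** 2).sum()

block = np.arange(64, dtype=float).reshape(8, 8)   # example 8x8 pixel block
print(sddc(block), variance(block))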
Each block energy analyzer further maps the total block energy measure for a current block to a corresponding modification value according to equation 3,

Block quantization mod = (α × a + mean(a)) / (a + α × mean(a))   (Equation 3)

wherein “α” is a multiplier (typically equal to 2) and “a” is the minimum block-SDDC or variance in a macroblock. Minimizer 123 next determines the minimum block quantization modification. The resultant minimum is then multiplied by Q_Nom to produce M_Quant, which is transferred to quantizer 103.
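Putting the elements of FIG. 1 together, the Python sketch below mimics the prior-art adaptive quantizer as described: it measures the energy of each of the four 8×8 blocks of a macroblock, maps each measure through Equation 3, keeps the minimum modification and scales the nominal quantization value Q_Nom. This is an illustrative reading of the text rather than the patent's own code; in particular, the use of the variance measure and the interpretation of mean(a) as an average block energy (similar to the avg_act term of MPEG-2 Test Model 5) are assumptions.

import numpy as np

ALPHA = 2.0   # multiplier from Equation 3 (typically 2)

def block_energy(block):
    # Variance-style energy (Equation 2): sum of squared differences from the mean.
    return ((block - block.mean()) ** 2).sum()

def modification(a, mean_a, alpha=ALPHA):
    # Equation 3: maps a block energy into a modification in roughly
    # (1/alpha, alpha); low-energy (flat) blocks give values below 1.
    return (alpha * a + mean_a) / (a + alpha * mean_a)

def m_quant(macroblock, q_nom, mean_a):
    # Split a 16x16 macroblock into four 8x8 blocks, map each block's energy
    # through Equation 3, keep the minimum modification and scale Q_Nom.
    blocks = [macroblock[r:r + 8, c:c + 8] for r in (0, 8) for c in (0, 8)]
    mods = [modification(block_energy(b), mean_a) for b in blocks]
    return q_nom * min(mods)

# Example: a flat macroblock yields a modification below 1, i.e. finer quantization.
mb = np.full((16, 16), 128.0)
print(m_quant(mb, q_nom=8.0, mean_a=500.0))

Because Equation 3 increases monotonically with the block energy, taking the minimum of the per-block modifications is equivalent to applying Equation 3 to the minimum block energy, which is how the text characterizes “a”.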
Unfortunately, such a block-energy perceptual model provides only a rough approximation of how distortion generally tends to perceptually blend into a picture; it does not necessarily result in a minimized or well-distributed bitrate, and resulting decoded video often exhibits so-called halo effects, mosquito noise and other artifacts. Attempts to improve reliability—typically by formatting macroblocks in a finer 16×16 block array—not only substantially increase processing and storage requirements, but also provide only limited improvement.
Other HVS models have also been attempted. For example, one method attempts to detect characters (e.g. alphanumerics) that are particularly sensitive to distortion and then, when detected, to add appropriate “special case” quantization modifications to an existing perceptual masking model. Unfortunately, despite the sometimes extensive resources currently required for added detection and compensation, no commercially available encoder appears to include an accurate working HVS model, let alone an economically feasible one.
Accordingly, there remains a need for apparatus and methods capable of modeling the HVS and of enabling accurate and efficient video quantization in accordance with perceptual masking.
SUMMARY OF THE INVENTION
The present invention provides for accurate and efficient perceptually adaptive picture quantization and, among other capabilities, enables video compression at a lower, more optimally distributed bitrate.
In one aspect, embodiments of the invention provide a perceptual model found to enable the determination of perceptual masking effects in a modifiable, yet accurate manner. In another aspect, received picture data and/or other information can be analyzed in accordance with perceptually significant picture attributes. Low-resource edge detection, as well as activity, luminance, temporal and positional perceptual significance determination and correlation (e.g. selection, combination, correspondence, etc.) are also enabled. Also provided are multiple-granularity (i.e. resolution, dimension, attribute, etc.) analysis and correlation, which are preferably used to produce perceptually-based quantization modifications.
In a preferred embodiment, adaptive quantization is provided within an MPEG-2 encoder integrated circuit (“IC”). Received picture data is analyzed to determine energy and edge attribute indicators and a multiple granularity correlation of the energy and edge attribute indicators is conducted to provide an activity-based quantization modification or “activity-modification.” The received picture data is also analyzed for luminance-sensitivity, and a resulting luminance-modification is correlated with the activity-modification and a further nominal-quantization offset (e.g. reflecting temporal-masking effects) to produce an intermediate modification. The intermediate modification is then limited. Finally, a positional-sensitivity determination is formed as a perimeter offset, which is correlated with the limited intermediate modification. The resulting positionally adapted modification is then rounded or truncated to produce a quantization modification, which is used by a quantizer in performing quantization.
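The following Python sketch is only a schematic reading of this pipeline; every function name, range and clipping limit is an illustrative assumption rather than the patent's implementation. It shows the order of operations: correlate energy and edge indicators into an activity-modification, combine it with a luminance-modification and a nominal-quantization offset, limit the intermediate result, apply a perimeter (positional) offset and round to obtain the final quantization modification.

def quant_modification(energy_ind, edge_ind, luma_mod, q_offset,
                       on_perimeter, perimeter_offset=1.0, limit=(-8.0, 8.0)):
    # Correlate energy and edge indicators into an activity-modification.
    # (Here simply damped by an edge indicator assumed to lie in [0, 1];
    # the patent's multiple-granularity correlation is far more elaborate.)
    activity_mod = energy_ind * (1.0 - edge_ind)

    # Correlate with the luminance-modification and the nominal-quantization
    # offset (e.g. a temporal-masking term) to form an intermediate modification.
    intermediate = activity_mod + luma_mod + q_offset

    # Limit the intermediate modification.
    lo, hi = limit
    intermediate = max(lo, min(hi, intermediate))

    # Apply the positional (perimeter) offset, then round to an integer step.
    if on_perimeter:
        intermediate += perimeter_offset
    return round(intermediate)

# Example: a busy, bright interior macroblock gets a positive (coarser) modification.
print(quant_modification(energy_ind=4.0, edge_ind=0.25, luma_mod=1.0,
                         q_offset=0.5, on_perimeter=False))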
Advantageously, embodiments of the present invention enable effective and efficient perceptual analysis.
Inventors: Tong, Zhijun; Viscito, Eric
Assignee: Conexant Systems, Inc.
Examiner: Do, Anh Hong
Attorneys: Thomas Kayden Horstemeyer & Risley LLP