Image analysis – Image compression or coding
Reexamination Certificate
1999-02-19
2001-09-25
Couso, Jose L. (Department: 2621)
active
06295375
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to a method of coding a sequence of pictures comprising at least the steps of:
subdividing each input picture into sub-pictures;
quantizing said signals with a variable quantizing scale;
encoding said quantized signals;
and to a coding device for carrying out said method. This invention may be used particularly for the implementation of MPEG-2 encoders.
BACKGROUND ART
The main principle of image compression techniques is to remove spatial and temporal data redundancy. To this end the MPEG standard, for instance, is based on the two following techniques: discrete cosine transform (DCT) and motion compensation (as described for example in the following document “MPEG video coding: a basic tutorial introduction”, S. R. Ely, BBC Report RD 1996/3).
A conventional MPEG-2 encoder mainly comprises, as indicated in FIG. 1, a formatting circuit 11, receiving each digitized picture of the video sequence concerned and intended to subdivide a picture signal (a bidimensional array of picture elements, or pixels) into disjoint sub-pictures or blocks of smaller size (8×8 or 16×16 pixels), a DCT circuit 12, intended to apply to each block of pixels a bidimensional discrete cosine transform (the transform coefficients thus obtained being generally normalized to a predetermined range), a quantization circuit 13 intended to compress by thresholding and quantization (with a variable quantizer scale) the bidimensional array of the transform coefficients thus obtained for each block of pixels, a variable length encoding circuit 14, and a motion-compensated prediction circuit 15. Said prediction circuit finds for each block a motion vector matching this block to another one in the previous picture of the sequence, displaces said previous block according to the motion vector, and subtracts (the subtracter is here assumed to be included in the prediction circuit 15) the predicted picture thus obtained from the current one to deliver the difference picture that will be transformed, quantized and coded. Moreover, a picture type defines which prediction mode, I, P, or B, will be used to code each macroblock: I type corresponds to I-pictures coded without reference to other pictures, P type to P-pictures coded using motion-compensated prediction from a past I- or P-picture, and B type to B-pictures using both past and future I- or P-pictures for motion compensation. A buffer 16 stores the output coded signals and smooths out the variations in the output bit rate, and a rate control and quantizer scale variation circuit 17, provided between said buffer and the quantization circuit 13, adjusts the variable quantizer scale.
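The formatting, DCT and quantization stages described above can be sketched as follows. This is a simplified illustration only, not the circuit of FIG. 1: the function names are hypothetical, the block size is assumed to be 8×8, the DCT is computed directly from its orthonormal definition, and quantization is reduced to a single uniform division by the quantizer scale (thresholding, quantization matrices, variable length encoding and motion compensation are left out).

```python
import math

N = 8  # block size assumed for this sketch (8x8 pixels)

def split_into_blocks(picture, n=N):
    """Subdivide a 2-D list of pixels into disjoint n x n blocks, in row-major order."""
    rows, cols = len(picture), len(picture[0])
    return [
        [row[c:c + n] for row in picture[r:r + n]]
        for r in range(0, rows, n)
        for c in range(0, cols, n)
    ]

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block, computed from the definition."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def quantize(coeffs, quantizer_scale):
    """Uniform quantization: a larger scale gives coarser steps, hence fewer bits."""
    return [[int(round(c / quantizer_scale)) for c in row] for row in coeffs]

# A flat 16x16 picture yields four 8x8 blocks whose energy sits in the DC term.
picture = [[100] * 16 for _ in range(16)]
blocks = split_into_blocks(picture)
quantized = quantize(dct_2d(blocks[0]), quantizer_scale=16)
```

Raising `quantizer_scale` drives more coefficients to zero, which is the lever a rate control stage would act on to keep the output bit rate in check.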
However, in most image processing systems, the final observer of the perceived images is the human eye. Image coding schemes incorporating a model of the human visual system (called HVS in the remainder of the description) may therefore be proposed, in which the HVS model is adapted to a coding scheme based on the MPEG-2 standard in order to obtain more pleasing images. An HVS model, whatever its complexity, must represent the visual processing performed by the human eye and therefore has to determine whether an image area is visually sensitive or not.
Many proposed HVS models rely on two key concepts, contrast and masking, which the HVS applies sequentially. It is known, indeed, that the human eye is sensitive to the luminance contrast across an image. The processing performed by the visual cortex applies not to the absolute light level but to the contrast, defined as the ratio of the local intensity information to the average image intensity. One of the simplest definitions of the contrast C is given by Weber's law:
$$C = \frac{\Delta L}{L_B} \qquad (1)$$
where ΔL is the luminance difference relative to the background and L_B is the background luminance. For more complex pictures, another definition of contrast may be given: the ratio of a band-limited version of the picture (which the HVS decomposes into a set of sub-pictures expressed in several frequency bands and various orientations) to the mean luminance contained in the remaining lower frequency bands. When such a multi-resolution HVS model is considered, the contrast assessment requires two steps: a first one for decomposing the picture into a set of sub-pictures at various scales and orientations, with a pyramidal decomposition such as the Simoncelli pyramid, and a second one for computing the contrast for each scale and each orientation. The masking effect is then taken into account through a masking function applied to the contrast information thus obtained; this effect corresponds to the variation of the visibility threshold of a stimulus as a function of the luminance present in its neighbourhood. In other words, there is masking when a signal (the stimulus) cannot be seen because of the presence of another signal with similar characteristics but at a higher level (here, the background luminance around the stimulus).
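As a small worked example of the Weber contrast of equation (1), the sketch below shows that the same luminance step is a weaker contrast on a brighter background. The function name and the sample luminance values are illustrative assumptions; the multi-resolution contrast and the masking function are not modelled here.

```python
def weber_contrast(stimulus_luminance, background_luminance):
    """Return C = delta_L / L_B, with delta_L = stimulus - background."""
    if background_luminance <= 0:
        raise ValueError("background luminance must be positive")
    delta_l = stimulus_luminance - background_luminance
    return delta_l / background_luminance

# Same luminance step (delta_L = 10) on two different backgrounds.
contrast_dark = weber_contrast(60.0, 50.0)      # L_B = 50
contrast_bright = weber_contrast(210.0, 200.0)  # L_B = 200
```

Here `contrast_dark` is four times `contrast_bright`, since the background luminance is four times smaller for the same step.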
Computations based on these two concepts (contrast, masking) finally yield perceptual measures for each pyramid band. Assuming that the relation between the DCT domain and the pyramidal frequency domain is linear, perceptual weighting factors (PWF) for each DCT basis function of each block are derived (by computation) from the perceptual measures obtained for each frequency and orientation band. This information may be exploited to allocate more bits to the most visually sensitive areas and fewer bits to other areas of the same picture. An encoder of this type is described for instance in the European patent application EP 0535963. In said encoder, a quantization control circuit detects, for each block, its degree of influence on visual sensation and generates a quantization control signal that specifies an appropriate quantization step size for the quantization circuit.
SUMMARY OF THE INVENTION
The object of the invention is to improve the visual quality obtained by means of such an adaptive quantization.
To this end the invention relates to a coding method such as defined in the preamble of the description, said method being further characterized in that it also comprises, before said quantizing step, the additional sub-steps of:
generating from each input picture a set of visual sensitivity values S(i) respectively associated to sub-pictures i of said input picture;
computing from said set of values perceptual coefficients W(i), one per sub-picture, said computation being based on the cumulative distribution function F(S(i)) associated to said values S(i) and according to the following expression:
$$W(i) = \left(1 + \frac{a}{2}\right) - a \cdot F(S(i))$$
where a is a constant provided for controlling the modulation amplitude.
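The computation of the perceptual coefficients W(i) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and F is taken to be the empirical cumulative distribution function of the values S(i) within one picture (the fraction of sub-pictures whose sensitivity is less than or equal to S(i)), which is one plausible way of estimating F. How W(i) is subsequently applied to the quantizer scale is not covered by this sketch.

```python
from bisect import bisect_right

def perceptual_coefficients(sensitivities, a):
    """Compute W(i) = (1 + a/2) - a * F(S(i)) for every sub-picture i.

    F is estimated here as the empirical CDF of the sensitivity values
    of the picture (an assumption of this sketch).
    """
    ordered = sorted(sensitivities)
    n = len(ordered)
    coefficients = []
    for s in sensitivities:
        f = bisect_right(ordered, s) / n  # fraction of values <= S(i)
        coefficients.append((1 + a / 2) - a * f)
    return coefficients

# Four sub-pictures with hypothetical sensitivity values, amplitude a = 0.5.
weights = perceptual_coefficients([0.1, 0.4, 0.2, 0.8], a=0.5)
# weights == [1.125, 0.875, 1.0, 0.75]
```

Because F takes values in (0, 1], the coefficients stay within the interval from 1 − a/2 to 1 + a/2, centred on 1, so the constant a directly sets the modulation amplitude, and the most sensitive sub-picture receives the smallest coefficient.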
The invention also relates, for carrying out said method, to a device for coding a sequence of pictures comprising at least formatting means for subdividing each input picture into sub-pictures, quantization means, provided for compressing by thresholding and quantization a digital bitstream corresponding to said pictures, encoding means, provided for coding the output signals of said quantizing means, and rate control and quantizer scale variation means, provided for ensuring a constant bit rate at the output of said coding device, characterized in that said device also comprises, in series between its input and said quantizing means, bit reallocation control means including:
means for generating from each input picture a set of visual sensitivity values, S(i) respectively associated to sub-pictures i of said input picture;
means for computing from said set of values perceptual coefficients W(i), one per sub-picture, said computation being based on the cumulative distribution function F(S(i)) associated to said values S(i) and according to the following expression:
$$W(i) = \left(1 + \frac{a}{2}\right) - a \cdot F(S(i))$$
where a is a constant provid
Couso Jose L.
Gross Russell
U.S. Philips Corporation
Method and device for coding a sequence of pictures