Image analysis – Color image processing – Compression of color images
Reexamination Certificate
1996-01-24
2001-11-27
Nguyen, Madeleine (Department: 2622)
C382S233000, C358S539000
active
06324301
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to image processing systems and, more particularly, to coding schemes where color image signals are transmitted between encoding and decoding devices.
2. Description of the Related Art
Discrete Cosine Transform (DCT) based coding schemes lie at the heart of most modern image communications applications, including videophones, set-top-boxes for video-on-demand, and video teleconferencing systems. With discrete cosine transformation, an image, or more properly a rectangular array of samples representing an image component, is divided into several square image blocks with each block consisting of a spatial array of n×n samples. The image data samples in each image block are then encoded with an orthogonal transform using the cosine function.
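The block division and transform described above can be sketched as follows. This is a minimal illustration, not the coder used in the patent: it assumes the orthonormal DCT-II as the "orthogonal transform using the cosine function", assumes image dimensions that are multiples of n, and the function names `split_into_blocks` and `dct2` are hypothetical.

```python
import math

def split_into_blocks(image, n):
    """Split a 2-D list of samples into n x n blocks, in row-major order.
    Assumes both image dimensions are multiples of n."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows, n):
        for c in range(0, cols, n):
            blocks.append([row[c:c + n] for row in image[r:r + n]])
    return blocks

def dct2(block):
    """Orthonormal 2-D DCT-II of a square n x n block (assumed transform)."""
    n = len(block)

    def alpha(k):
        # Normalization that makes the transform orthogonal.
        return math.sqrt((1.0 if k == 0 else 2.0) / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs

# Usage: a 4 x 4 component split into four 2 x 2 blocks, each transformed.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
transformed = [dct2(b) for b in split_into_blocks(image, 2)]
```

The direct four-loop form is chosen for readability; practical coders use fast separable implementations.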
In the Discrete Cosine Transform (DCT) method, the signal power of each image block is concentrated at specific frequency components. The distribution of signal power at those frequency components is then encoded by expressing it as a set of scalar coefficients of the cosine transform. By encoding only those coefficients with correspondingly high concentrations of signal power, the amount of data that needs to be transmitted or recorded to represent the original image is significantly reduced. In this manner, the image data is encoded and compressed for transmission.
One problem with this method of image transmission is that the distribution of the coefficients of the signal power produced by the discrete cosine transform directly affects the image coding efficiency and thus the compressibility of the data which needs to be transmitted. For example, when the video image to be coded is a flat pattern image, such as an image of the sky on a clear day, the discrete cosine transform coefficients (DCT coefficients) are concentrated in the low frequency components. As a result, the image information can be compressed and transmitted using a small number of coefficients, by merely coding the coefficients corresponding to low frequency components.
However, whenever the video image to be coded includes either contours, edges, or strongly textured patterns, such as a plaid pattern, the DCT coefficients are distributed broadly among both low and high frequency components, requiring that a large number of coefficients be transmitted, thus reducing the coding efficiency and limiting the ability to transmit compressed image information on a low bit rate channel. To solve this problem, techniques such as coarsening (“rounding-off”) the values of the DCT coefficients, or discarding the high frequency component coefficients, have been employed to reduce the volume of data to be transmitted, thereby increasing the ability to transmit compressed video images. These techniques, however, produce decoded images that can be strongly distorted when compared to the original images. One type of commonly occurring distortion is referred to as “mosquito noise”, since its appearance in a decoded video segment gives the illusion of “mosquitoes” closely surrounding objects. “Mosquito noise” is caused by the coarse quantization of the high frequency component coefficients which are generated from contours, edges, or strongly textured patterns contained in the original video image.
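The contrast between flat and edge content can be illustrated with a small sketch. It is a simplification under stated assumptions: it works on a one-dimensional row of 8 samples rather than an n×n block, uses the orthonormal DCT-II, and models only the "discarding the high frequency component coefficients" technique; all function names and the significance threshold are hypothetical.

```python
import math

def dct1d(samples):
    """Orthonormal 1-D DCT-II (assumed transform)."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        coeffs.append(scale * sum(
            s * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i, s in enumerate(samples)))
    return coeffs

def idct1d(coeffs):
    """Inverse of dct1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        samples.append(sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n) * c
            * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for k, c in enumerate(coeffs)))
    return samples

def count_significant(coeffs, threshold=1.0):
    """Number of coefficients whose magnitude exceeds the threshold."""
    return sum(1 for c in coeffs if abs(c) > threshold)

def discard_high_frequencies(coeffs, keep):
    """Zero every coefficient except the `keep` lowest-frequency ones."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]

def max_error(samples, keep):
    """Largest sample error after coding with only `keep` coefficients."""
    decoded = idct1d(discard_high_frequencies(dct1d(samples), keep))
    return max(abs(s - d) for s, d in zip(samples, decoded))

flat = [100.0] * 8                # e.g. a row of clear sky
edge = [0.0] * 4 + [100.0] * 4    # a sharp contour

print(count_significant(dct1d(flat)), count_significant(dct1d(edge)))
print(max_error(flat, 2), max_error(edge, 2))
```

The flat row needs a single coefficient and survives the truncation essentially unchanged, while the edge row spreads its power over several coefficients and is visibly distorted once the high-frequency ones are dropped.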
In order to reduce distortions, including “mosquito noise”, postfilter arrangements, such as the one illustrated in FIG. 1, have been developed. In FIG. 1, there is shown in block diagram format, an example of a typical prior art postfilter arrangement in which image or video information is encoded in an encoder 110 and decoded in a decoder 120. An input signal from a conventional video camera such as the View Cam, manufactured by Sharp Corporation, is provided over line 101 to an encoder unit 111 in encoder 110. Encoder unit 111 codes the images received via the input signal and produces a compressed bitstream which is transmitted over communication channel 102. Communication channel 102 has a low transmission rate of, for example, 16 kilobits/second.
Coupled to communication channel 102 is decoder 120, which includes decoder unit 121 and postfilter 122. Decoder unit 121 is used to decompress the received bitstream and to produce decoded images. The decoded images are then improved by postfilter 122, which uses postfilter parameters to adjust filter strength, thus removing some of the distortions in the decoded video images that were produced in encoder 110 when the bitstream was compressed for transmission. The postfilter parameters used to adjust postfilter 122 are determined based on a combination of the frame rate and transmission rate of the encoded bit stream and are obtained from an empirical lookup table located in the decoder.
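The prior art parameter selection described above can be sketched as a table lookup. The band edges, the strength values, and the names `BITRATE_EDGES_KBPS`, `FRAME_RATE_EDGES_FPS`, `STRENGTH_TABLE` and `lookup_strength` are all hypothetical; the source states only that an empirical table indexed by frame rate and transmission rate is used.

```python
from bisect import bisect_left

# Hypothetical upper edges of the bit-rate and frame-rate bands.
BITRATE_EDGES_KBPS = [16, 32, 64]
FRAME_RATE_EDGES_FPS = [5, 10, 15]

# Hypothetical empirical table: rows are bit-rate bands, columns are
# frame-rate bands; the last row and column cover values above the last edge.
STRENGTH_TABLE = [
    [0.6, 0.8, 0.9, 1.0],
    [0.4, 0.6, 0.8, 0.9],
    [0.2, 0.4, 0.6, 0.8],
    [0.0, 0.2, 0.4, 0.6],
]

def lookup_strength(bitrate_kbps, frame_rate_fps):
    """Return a filter strength from channel parameters only."""
    row = bisect_left(BITRATE_EDGES_KBPS, bitrate_kbps)
    col = bisect_left(FRAME_RATE_EDGES_FPS, frame_rate_fps)
    return STRENGTH_TABLE[row][col]

# Usage: a 16 kilobits/second channel at 7.5 frames per second.
strength = lookup_strength(16, 7.5)
```

Note that the function takes no image data as input, so every frame sent at a given frame rate and transmission rate receives the same strength regardless of how distorted it actually is.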
In general, postfilter arrangements, such as the one depicted in FIG. 1, tend to either over-filter decoded video images that are “clean” to begin with (i.e. fairly free of distortions), thereby unnecessarily blurring edges and texture, or to under-filter video images that are very “noisy”, leaving many of the stronger distortions in place. This is because the postfilter parameters used to control the strength of the postfilter, such as postfilter 122, are not determined based on the DCT coefficient quantization errors of the video images generated by the encoder, but rather are adjusted based on a combination of frame and transmission rates for the transmitted bitstream.
Another problem associated with the DCT method of coding video images, in low bit rate systems, is that the distortions which are produced during coding tend to affect various areas of the image without discrimination. Viewers of such decoded video images tend to find distortions to be much more noticeable in areas of interest to them. For example, in typical video teleconferencing or telephony applications the viewer will tend to focus his or her attention on the face(s) of the person(s) in the scene, rather than on other areas such as clothing and background. Moreover, even though fast motion in a coded image is known to mask coding distortions, the human visual system has the ability to “lock on” and “track” particular moving objects in a scene, such as a person's face. A postfilter arrangement such as the one illustrated in FIG. 1, when applied to distorted video images which contain facial regions, may result in facial features being overly smoothed-out, giving faces an artificial quality. For example, fine facial features such as wrinkles that are present in the original video image could be erased in a decoded video image. For the above reasons, communication between users of very low bitrate video teleconferencing and telephony systems tends to be more intelligible and psychologically pleasing to the viewers when facial features are not plagued with too many coding distortions.
SUMMARY OF THE INVENTION
In accordance with the invention, there is provided an arrangement for adaptively postfiltering a decoded video image, wherein the postfilter parameters used to control the strength of the postfilter are computed by the encoder at the time of encoding and are transmitted to the postfilter via the decoder, as side information, in the video image bitstream. This postfiltering process removes distortions from the decoded video images, derived as a result of DCT coefficient quantization errors produced when the image is compressed for transmission to a decoder, and is based on computations of signal-to-noise ratios (SNRs), of one or more components of encoded video images. Other information about image content, such as face location information, can also be included in the side information sent to the postfilter in the video image bitstream, to modulate the postfilter strength according to the image content.
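The encoder-side computation summarized above can be sketched as follows. Only the overall idea comes from the text: the encoder measures a signal-to-noise ratio and sends postfilter parameters as side information, optionally modulated by face location. The linear mapping from SNR to strength, the 20 dB and 40 dB limits, the halving of strength inside a face region, and every function and field name are assumptions made for illustration.

```python
import math

def snr_db(original, decoded):
    """Signal-to-noise ratio in decibels between two sample sequences."""
    signal = sum(o * o for o in original)
    noise = sum((o - d) ** 2 for o, d in zip(original, decoded))
    if noise == 0:
        return math.inf          # lossless reconstruction
    if signal == 0:
        return -math.inf         # all-zero original with nonzero error
    return 10.0 * math.log10(signal / noise)

def strength_from_snr(snr, low_db=20.0, high_db=40.0):
    """Assumed mapping: full strength at or below low_db, none at or
    above high_db, linear in between."""
    if snr >= high_db:
        return 0.0
    if snr <= low_db:
        return 1.0
    return (high_db - snr) / (high_db - low_db)

def make_side_info(original, decoded, in_face_region=False, face_factor=0.5):
    """Build the hypothetical side information the encoder would transmit."""
    snr = snr_db(original, decoded)
    strength = strength_from_snr(snr)
    if in_face_region:
        # Assumption: filter more gently where facial detail matters.
        strength *= face_factor
    return {"snr_db": snr, "strength": strength}

# Usage: the encoder compares its own reconstruction with the source.
original = [100.0, 100.0, 100.0, 100.0]
noisy = [103.0, 97.0, 103.0, 97.0]
info = make_side_info(original, noisy, in_face_region=True)
```

Because the encoder has access to both the original and the reconstructed samples, it can measure the actual coding error, which the decoder alone cannot do.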
In an adaptive postfilter encoding arrangement input video images are provided to an encoder. The encoder codes the video images and computes the signal-to-noise ra
Jacquin Arnaud Eric
Okada Hiroyuki
Lucent Technologies - Inc.
Nguyen Madeleine
Adaptive postfilter for low bitrate visual telephony noise...