Title: Image processing apparatus including an image data encoder...
Classification: Image analysis – Image compression or coding – Adaptive coding
U.S. classes: C375S240020, C382S240000
Patent number: 6603883
Type: Reexamination Certificate (active)
Filed: 1999-09-03
Issued: 2003-08-05
Examiner: Johnson, Timothy M. (Department: 2623)
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an image processing apparatus and method therefor. More specifically, the present invention relates to an image processing apparatus for encoding and decoding image data and to a method of encoding and decoding the same.
2. Related Background Art
JPEG (Joint Photographic Experts Group), H.261, and MPEG (Moving Picture Experts Group), which builds on H.261, exist as international standards for the encoding of sound and image data. To handle integrated sound and images in the current multimedia age, MPEG has been improved to MPEG1, and MPEG1 has been further improved to MPEG2, both of which are currently in widespread use.
MPEG2 is a standard for moving-picture encoding that has been developed in response to demands for high image quality. Specifically:
(1) it can be used for applications ranging from communications to broadcasting, in addition to stored media data,
(2) it can be used for images with much higher quality than standard television, with the possibility of extension into High Definition Television (HDTV),
(3) unlike MPEG1 and H.261, which can only be used with non-interlaced image data, MPEG2 can be used to encode interlaced images,
(4) it possesses scalability, and
(5) an MPEG2 decoder is able to process an MPEG1 bit stream; in other words, it is backward compatible.
Of the five characteristics listed, item (4), scalability, in particular is new to MPEG2. It is roughly classified into three types: spatial scalability, temporal scalability, and signal-to-noise ratio (SNR) scalability, which are outlined below.
Spatial Scalability
FIG. 1 shows an outline of spatial scalability encoding. The base layer has a small spatial resolution, while the enhancement layer has a large spatial resolution.
The base layer is produced by spatially sub-sampling the original image at a fixed ratio, lowering the spatial resolution (image quality) and reducing the amount of code per frame. In other words, it is a layer with lower spatial-resolution image quality and a smaller code amount. Encoding takes place using inter-frame prediction within the base layer, which means the image can be decoded from the base layer alone.
On the other hand, the enhancement layer has a high spatial-resolution image quality and a large code amount. The base layer image data is up-sampled (for example, averaging is used to insert a pixel between pixels of the low-resolution image, creating a high-resolution image) to generate an expanded base layer of the same size as the enhancement layer. Encoding takes place using not only predictions from images within the enhancement layer, but also predictions from the up-sampled expanded image. Therefore it is not possible to decode the image from the enhancement layer alone.
By decoding image data of the enhancement layer, encoded as described above, an image with the same spatial size as the original image is obtained, the image quality depending upon the rate of compression.
The use of spatial scalability allows two image sequences to be efficiently encoded, as compared to encoding and sending each image separately.
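To make the layering concrete, the following is a minimal sketch in Python (with NumPy) of the base/enhancement split described above. The function names, the block-averaging and pixel-repetition filters, and the assumption that the image dimensions divide evenly by the sampling ratio are illustrative choices, not part of the MPEG2 specification.

```python
import numpy as np

def downsample(frame: np.ndarray, ratio: int = 2) -> np.ndarray:
    """Base layer: spatial sub-sampling by averaging ratio x ratio blocks.
    Assumes the frame dimensions are divisible by the ratio."""
    h, w = frame.shape
    return frame.reshape(h // ratio, ratio, w // ratio, ratio).mean(axis=(1, 3))

def upsample(frame: np.ndarray, ratio: int = 2) -> np.ndarray:
    """Expanded base layer: pixel repetition as a crude stand-in for the
    interpolation-style up-sampling mentioned in the text."""
    return frame.repeat(ratio, axis=0).repeat(ratio, axis=1)

def encode_spatial_layers(original: np.ndarray, ratio: int = 2):
    """The base layer is decodable on its own; the enhancement layer kept
    here is only the prediction residual against the expanded base layer,
    so it cannot be decoded without the base layer."""
    base = downsample(original, ratio)
    residual = original - upsample(base, ratio)   # inter-layer prediction error
    return base, residual

def decode_both_layers(base: np.ndarray, residual: np.ndarray, ratio: int = 2) -> np.ndarray:
    """Decoding both layers restores an image at the original spatial size."""
    return upsample(base, ratio) + residual
```

In this lossless sketch, decode_both_layers(base, residual) reproduces the original frame exactly; in MPEG2 itself both layers are additionally quantized and entropy coded, so the reconstruction quality depends on the compression rate as noted above.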
Temporal Scalability
FIG. 2 shows an outline of temporal scalability encoding. The base layer has a small temporal resolution, while the enhancement layer has a large temporal resolution.
The base layer is produced by thinning out the original sequence frame by frame at a constant rate, lowering the temporal resolution (frame rate) and reducing the amount of encoded data to be transmitted. In other words, it is a layer with lower temporal-resolution image quality and a smaller code amount. Encoding takes place using inter-frame prediction within the base layer, which means the image can be decoded from the base layer alone.
On the other hand, the enhancement layer has a high temporal-resolution image quality and a large code amount. Encoding takes place using prediction not only from the I, P, and B pictures within the enhancement layer, but also from the base layer image data. Therefore it is not possible to decode the image from the enhancement layer alone.
By decoding image data of the enhancement layer, encoded as described above, an image with the same frame rate as the original image is obtained, the image quality depending upon the rate of compression.
Temporal scalability allows, for example, a 30 Hz non-interlaced image sequence and a 60 Hz non-interlaced image sequence to be sent efficiently at the same time.
Temporal scalability is currently not in use; it is reserved as a future extension of MPEG2.
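As a rough illustration only, the following Python sketch mimics the frame thinning and base-layer prediction described above. The 1:2 thinning ratio, the choice of the nearest preceding base-layer frame as the prediction reference, and the assumptions that frames are NumPy arrays and that the sequence length divides evenly are all illustrative, not taken from the standard.

```python
import numpy as np

def split_temporal_layers(frames, keep_every: int = 2):
    """Base layer: every keep_every-th frame (lower frame rate, decodable alone).
    Enhancement layer: the remaining frames, stored here as residuals against
    the nearest preceding base-layer frame, so they cannot be decoded alone."""
    base = frames[::keep_every]
    enhancement = []
    for i, frame in enumerate(frames):
        if i % keep_every != 0:
            reference = frames[(i // keep_every) * keep_every]
            enhancement.append((i, frame - reference))
    return base, enhancement

def merge_temporal_layers(base, enhancement, keep_every: int = 2):
    """Decoding both layers restores the original frame rate.
    Assumes the original sequence length was a multiple of keep_every."""
    frames = [None] * (len(base) * keep_every)
    for j, frame in enumerate(base):
        frames[j * keep_every] = frame
    for i, residual in enhancement:
        frames[i] = frames[(i // keep_every) * keep_every] + residual
    return frames
```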
SNR Scalability
FIG. 3 shows an outline of SNR scalability encoding.
The layer having a low image quality is referred to as a base layer, whereas the layer having a high image quality is referred to as an enhancement layer.
The base layer is produced in the process of encoding (compressing) the original data, for example by dividing it into blocks, DCT-transforming, quantizing, and variable-length encoding, with the original image compressed at a relatively high compression rate (a coarse quantization step size) to yield a smaller code amount. That is, the base layer is a layer with a low (S/N) image quality and a smaller code amount. In the base layer, encoding is carried out using MPEG1 or MPEG2 (with predictive encoding) applied to each frame.
On the other hand, the enhancement layer has a higher image quality and a larger code amount than the base layer. The enhancement layer is provided by decoding the encoded base layer image, subtracting the decoded image from the original image, and intra-frame encoding only the resulting difference at a relatively low compression rate (with a quantization step size smaller than in the base layer). This enhancement-layer encoding in SNR scalability takes place entirely within the frame (field); no inter-frame (inter-field) prediction encoding is used.
Using SNR scalability allows two types of images with differing picture quality to be encoded or decoded efficiently at the same time.
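A minimal sketch of this two-layer quantization idea follows, assuming uniform scalar quantization of transform coefficients; the step sizes and function names are placeholders, and the block splitting, DCT, and variable-length coding stages are omitted.

```python
import numpy as np

def quantize(coeffs: np.ndarray, step: float) -> np.ndarray:
    """Coarser step size -> lower SNR but smaller code amount."""
    return np.round(coeffs / step).astype(np.int32)

def dequantize(levels: np.ndarray, step: float) -> np.ndarray:
    return levels.astype(np.float64) * step

def encode_snr_layers(coeffs: np.ndarray, base_step: float = 16.0, enh_step: float = 4.0):
    """Base layer: coarse quantization of the transform coefficients.
    Enhancement layer: the base layer's quantization error, re-quantized
    with a finer step, i.e. the difference layer described above."""
    base_levels = quantize(coeffs, base_step)
    error = coeffs - dequantize(base_levels, base_step)
    enh_levels = quantize(error, enh_step)
    return base_levels, enh_levels

def decode_snr(base_levels, enh_levels=None, base_step: float = 16.0, enh_step: float = 4.0):
    """The base layer alone gives a low-SNR reconstruction; adding the
    enhancement layer refines it toward the original coefficients."""
    recon = dequantize(base_levels, base_step)
    if enh_levels is not None:
        recon = recon + dequantize(enh_levels, enh_step)
    return recon
```

Decoding with the base layer alone corresponds to the low-quality option, while decoding both layers corresponds to the high-quality option discussed below.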
However, previous encoding devices have not provided an option to freely select the size of the base layer image in spatial scalability. The image size of the base layer is fixed by the relationship between the enhancement layer and the base layer, and hence cannot be varied.
In addition, temporal scalability has faced a similar limitation: the base layer frame rate is determined uniquely as a function of the enhancement layer and cannot be freely selected.
Therefore, previous encoding devices have not allowed the code amount, in terms of factors such as image size and frame rate, to be selected when using the scalability function, even though these factors relate directly to the condition of the decoding device or the transmission lines on the output side.
In other words, when encoded image data is output from an encoding device employing spatial scalability or SNR scalability to a decoding device (the receiving side), the available image quality choices are limited to:
1) a low quality image decoded from the base layer only, or
2) a high quality image provided by decoding both the base layer and the enhancement layer.
Accordingly, there is no opportunity to select the image quality (and hence decoding speed) in accordance with the capabilities of the decoding device or the needs of an individual user, a problem that has not been addressed previously.
In addition, recent advances have taken place in the imaging field related to object encoding. MPEG4, currently being advanced as an imaging technology standard, is a good example. MPEG4 splits an image into a background and several objects that exist in front of that background, and then encodes each of the different parts independently. Object encoding offers many benefits.
If the background is a relatively static environment and only some of the objects in the foreground are in motion, then the background and all objects that do not move do not need to be re-encoded.
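As a hedged illustration of this skip logic, an encoder might decide per object whether re-encoding is needed; the dictionary layout, the mean-absolute-difference change test, and the threshold below are assumptions made for the sketch, not part of MPEG4 itself.

```python
import numpy as np

def encode_changed_objects(objects, previous, threshold: float = 1.0):
    """objects / previous: dicts mapping an object name (including the
    background) to its pixel array in the current / previous frame.
    Only objects that have changed noticeably are re-encoded; static
    objects and a static background are signalled as 'skip'."""
    bitstream = {}
    for name, pixels in objects.items():
        prev = previous.get(name)
        unchanged = (prev is not None and prev.shape == pixels.shape
                     and np.abs(pixels.astype(np.float64) - prev).mean() < threshold)
        if unchanged:
            bitstream[name] = "skip"         # reuse the previously sent encoding
        else:
            bitstream[name] = pixels.copy()  # (re-)encode this object
    return bitstream
```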
Assignee: Canon Kabushiki Kaisha
Law firm: Fitzpatrick, Cella, Harper & Scinto