Image encoding/decoding device
Patent number: RE037668 (Reissue Patent; status: active)
Filed: 2000-06-16
Issued: 2002-04-23
Assignee: Matsushita Electric Industrial Co., Ltd.
Examiner: Mehta, Bhavesh (Department: 2621)
Attorney, Agent, or Firm: Smith, Gambrell & Russell, LLP
Classification: Image analysis – Image compression or coding – Quantization (C382S283000)
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to image-encoding methods, image-decoding methods, and image-processing methods that can be used for encoding, transmitting, and accumulating images, especially region images showing the occupancy region of a projective image of a substance, and to devices therefor.
The present invention also relates to a motion vector-detecting device used for image encoding and for format transformation such as frame frequency transformation, to an image-encoding device for transmitting and recording images with a small coded volume, and to an image-decoding device.
The present invention further relates to image-encoding methods for transmitting and accumulating images with a smaller coded volume, and to a device therefor.
2. Related Art of the Invention
Conventionally, when images are synthesized by computer graphics and the like, information relating to the opacity (transparency) of a substance, referred to as the "α value", is required in addition to the luminance of the substance.
The α value is determined for every pixel; an α value of 1 means the pixel is opaque (belongs entirely to the substance), and an α value of 0 means it is completely transparent. Namely, when an image of a certain substance is embedded in a background, an image carrying the α value is necessary. Hereinafter, the image of such α values is referred to as the "α plane". Incidentally, the α value takes intermediate values in [0, 1] in the case of substances such as clouds, frosted glass and the like, but for many substances it tends to take only the two values {0, 1}.
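As a concrete illustration of how an α plane is used when a substance image is embedded in a background, the following Python sketch mixes the substance luminance and the background luminance pixel by pixel according to α (the array shapes and luminance values are illustrative assumptions, not part of the present specification):

import numpy as np

def composite(foreground, background, alpha):
    # Embed a substance image into a background using an alpha plane.
    # alpha has values in [0, 1]; 1 = the pixel belongs entirely to the
    # substance, 0 = completely transparent (background shows through).
    return alpha * foreground + (1.0 - alpha) * background

# Example: a 4x4 dark background with a 2x2 bright substance region.
bg = np.zeros((4, 4))
fg = np.full((4, 4), 255.0)
a = np.zeros((4, 4))
a[:2, :2] = 1.0          # binary alpha plane: substance occupies the top-left block
print(composite(fg, bg, a))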
Encoding of the α plane may be performed by direct enumeration of the pixel values; however, when the α plane consists of the two values {0, 1}, binary image-encoding techniques such as MH, MR and MMR encoding, which are CCITT international standards conventionally used for facsimile and the like, may be used. These are generally referred to as "run-length coding".
In run-length coding, the number of horizontally (or horizontally and vertically) consecutive pixels of value 0 or 1 is entropy-coded, so that coding is performed efficiently.
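As an illustration of the principle (not the MH/MR/MMR code tables themselves; the entropy-coding stage is omitted), a row of binary pixels can be turned into runs as in the following Python sketch:

def run_lengths(row):
    # Convert a row of 0/1 pixels into (value, run length) pairs.
    # In MH/MR/MMR the run lengths would then be entropy-coded with
    # predefined variable-length code tables; here we only enumerate runs.
    runs = []
    if not row:
        return runs
    current, count = row[0], 1
    for pixel in row[1:]:
        if pixel == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = pixel, 1
    runs.append((current, count))
    return runs

print(run_lengths([0, 0, 0, 1, 1, 0, 1, 1, 1, 1]))
# -> [(0, 3), (1, 2), (0, 1), (1, 4)]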
Furthermore, taking notice of the contour of the substance boundary, the positional information of each pixel constituting the contour may be coded. In the present specification, encoding of the contour of the substance boundary is hereinafter referred to as contour encoding.
As typical contour encoding, chain encoding can be mentioned (described in H. Freeman: "Computer Processing of Line Drawing Data", Computing Surveys, vol. 6, no. 1, pp. 57-96, 1974).
For an image in which the contour of the substance boundary is simple, the α plane can be encoded highly efficiently by chain-coding the group of pixels constituting the contour of the region having the α value of 1.
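As a minimal sketch of Freeman-style chain coding, the contour is assumed here to be given as an ordered list of 8-connected (row, column) pixel coordinates; each step to the next contour pixel is then encoded as one of eight direction codes. This Python fragment illustrates the principle only and is not the encoder of the present specification.

# 8-connectivity direction codes: 0 = right, numbered counter-clockwise.
DIRECTIONS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
              (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(contour):
    # Encode an ordered list of 8-connected (row, col) contour pixels
    # as a start point plus a sequence of 3-bit direction codes.
    start = contour[0]
    codes = [DIRECTIONS[(r1 - r0, c1 - c0)]
             for (r0, c0), (r1, c1) in zip(contour, contour[1:])]
    return start, codes

# A small contour fragment: two steps to the right, then one step up.
print(chain_code([(2, 2), (2, 3), (2, 4), (1, 4)]))
# -> ((2, 2), [0, 0, 2])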
Considering the visual characteristics that govern how the decoded α plane is perceived, the above-mentioned run-length coding and chain coding methods, and the devices thereof, have a defect: since encoding/decoding is carried out for every pixel, the {0, 1} pattern is coded and decoded more accurately than human visual characteristics require, even though it is not necessary to decode the {0, 1} pattern exactly, and a large coded volume therefore becomes necessary.
More concretely, in general image synthesis a process referred to as "anti-aliasing", which mixes the image with the color values of the background image, is performed in the vicinity of the boundary of the image to be synthesized. This is equivalent to smoothing the α value in the vicinity of the substance boundary, treating the α value as a gray scale over [0, 1]. In other words, in an image such as an α plane, high spatial resolution is not really required; instead, amplitude resolution becomes necessary in the vicinity of the substance boundary.
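As an illustration of this point, the following Python sketch smooths a binary α plane with a small box filter so that intermediate values appear only near the substance boundary; the filter choice and radius are assumptions made purely for illustration, not a part of the anti-aliasing processing referred to above.

import numpy as np

def smooth_alpha(alpha, radius=1):
    # Box-filter a binary alpha plane so that it becomes a [0, 1] gray
    # scale near the substance boundary while staying 0 or 1 elsewhere.
    h, w = alpha.shape
    padded = np.pad(alpha.astype(float), radius, mode="edge")
    out = np.zeros((h, w))
    for r in range(h):
        for c in range(w):
            out[r, c] = padded[r:r + 2 * radius + 1,
                               c:c + 2 * radius + 1].mean()
    return out

a = np.zeros((5, 5))
a[:, :2] = 1.0           # binary alpha: the substance fills the two left columns
print(smooth_alpha(a))   # columns near the boundary take values such as 2/3 and 1/3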
Conventional run-length coding and chain coding thus have the problem that, because they are reversible (lossless) coding, the spatial resolution is higher than visual characteristics require, and a large coded volume therefore becomes necessary.
Furthermore, J. Wang and E. Adelson have proposed a method of encoding dynamic images by decomposing the dynamic image into layer images, as shown in FIG. 31, in order to efficiently perform transmission and recording of the dynamic image.
According to the literature in which this method is disclosed, "Layered Representation for Image Sequence Coding" by J. Wang and E. Adelson, Proc. IEEE Int. Conf. Acoustic Speech Signal Processing '93, pp. V221-V224, 1993, and "Layered Representation for Motion Analysis" by J. Wang and E. Adelson, Proc. Computer Vision and Pattern Recognition, pp. 361-366, 1993, the image processing steps (1) to (3) described below are performed:
(1) A region described by the same motion parameter (in the conventional case, affine transformation parameter) is extracted from the dynamic images.
(2) A layer image is formed by superposing the regions having the same motion. Each layer image is expressed by an opacity and a luminance for every pixel, the opacity showing the occupancy of the superposed region.
(3) The upper and lower relations between layer images along the viewing direction are examined, and the layers are ordered accordingly.
Here, the affine transformation parameters mean the coefficients $a_0$ to $a_5$ shown in Expression (1), where the horizontal/vertical position in the image is denoted $(x, y)$ and the horizontal/vertical components of the motion vector are denoted $(u, v)$:

$(u(x, y),\, v(x, y)) = (a_0 + a_1 x + a_2 y,\; a_3 + a_4 x + a_5 y)$  (1)
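As a minimal illustration of Expression (1), the following Python sketch evaluates the motion vector at a given pixel position from six affine parameters (the parameter values are arbitrary examples, not taken from the specification):

def affine_motion(params, x, y):
    # Motion vector (u, v) at image position (x, y) for affine
    # parameters (a0, ..., a5), as in Expression (1).
    a0, a1, a2, a3, a4, a5 = params
    u = a0 + a1 * x + a2 * y
    v = a3 + a4 * x + a5 * y
    return u, v

# Arbitrary example: small translation combined with a slight rotation/zoom.
params = (1.0, 0.01, -0.02, -0.5, 0.02, 0.01)
print(affine_motion(params, x=100, y=50))   # -> (1.0, 2.0)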
It is known that the motion of the projective image of a rigid body located at a sufficient distance from the camera can be approximated by affine transformation parameters. Wang and Adelson utilize this to synthesize dynamic images of from several tens to several hundreds of frames, while transforming several kinds of layer images, each composed of one frame, by the affine transformation. The only information required for transmitting and recording such a dynamic image is the image which is the basis of deformation for each layer image (hereinafter referred to as the "template"), the affine transformation parameters, and the upper and lower relations of the layer images; therefore, transmission and recording of the dynamic image can be performed at a very high coding efficiency. In addition, for image synthesis, the template is expressed by an opacity and a luminance for every pixel, the opacity showing the occupancy of the region.
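The synthesis described above can be sketched roughly as follows in Python: each template (luminance plus opacity) is warped by its affine parameters and the layers are composited from the lowest to the uppermost using the per-pixel opacity. The backward nearest-neighbor warping and the data layout are simplifying assumptions made for this sketch, not the authors' implementation.

import numpy as np

def warp_affine(image, params):
    # Backward-map `image` with the affine motion of Expression (1),
    # using nearest-neighbor sampling for brevity.
    a0, a1, a2, a3, a4, a5 = params
    h, w = image.shape
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            sx = int(round(x - (a0 + a1 * x + a2 * y)))   # source column
            sy = int(round(y - (a3 + a4 * x + a5 * y)))   # source row
            if 0 <= sx < w and 0 <= sy < h:
                out[y, x] = image[sy, sx]
    return out

def synthesize(layers):
    # `layers` is a list of (luminance, alpha, affine params) tuples,
    # ordered from the lowest layer (background) to the uppermost layer.
    frame = np.zeros_like(layers[0][0])
    for lum, alpha, params in layers:
        lum_w = warp_affine(lum, params)
        alpha_w = warp_affine(alpha, params)
        frame = alpha_w * lum_w + (1.0 - alpha_w) * frame
    return frame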
However, the dynamic image expression by J. Wang and E. Adelson deals only with motion of the projective image of a rigid body that can be described by the affine transformation. Therefore, their dynamic image expression cannot cope with cases where the motion of the projective image cannot be described by the affine transformation. For example, it cannot be applied when the person shown in FIG. 31 makes a non-rigid motion, or when the camera-substance distance is small and the nonlinear term of the perspective transformation cannot be ignored. Moreover, their technique for determining the motion of the projective image as affine transformation parameters consists of the two processing stages described below:
1. A local motion vector is determined at each position on the screen by a method based on the space-time gradient relation of the luminance, in which the temporal change of the luminance is approximated by the inner product of the spatial luminance gradient and the motion vector (B. Lucas and T. Kanade: "An Iterative Image Registration Technique with an Application to Stereo Vision", Proc. Image Understanding Workshop, pp. 121-130, April 1981); a minimal sketch of this gradient relation is given after this list.
2. The affine transformation parameters are determined by clustering the obtained motion vectors.
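A minimal Python sketch of stage 1 follows. The spatial derivatives Ix, Iy and the temporal derivative It of the luminance are assumed to be given for a small window; the code solves the standard gradient constraint Ix·u + Iy·v + It ≈ 0 in the least-squares sense, which illustrates the relation used but is not the authors' exact procedure.

import numpy as np

def local_motion(ix, iy, it):
    # Least-squares motion vector (u, v) for one window from the
    # space-time gradient relation  ix*u + iy*v + it ≈ 0.
    A = np.stack([ix.ravel(), iy.ravel()], axis=1)   # N x 2 gradient matrix
    b = -it.ravel()                                  # N temporal derivatives
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic window whose gradients are consistent with the motion (1, 2):
ix = np.array([[1.0, 0.5], [0.2, 1.0]])
iy = np.array([[0.3, 1.0], [1.0, 0.4]])
it = -(ix * 1.0 + iy * 2.0)          # fabricated so the true answer is (1, 2)
print(local_motion(ix, iy, it))      # -> approximately (1.0, 2.0)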
The above-mentioned technique, however, cannot be applied when there is a large motion in the dynamic image, such that the relational expression of the space-time gradient of the luminance no longer holds.