Image analysis – Image compression or coding – Transform coding
Reexamination Certificate
1999-01-06
2002-05-21
Mehta, Bhavesh (Department: 2621)
C382S250000, C375S240260, C358S438000
active
06393156
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to methods and apparatus for improving image quality in image and video compression systems. More particularly, the present invention relates to a technique that employs pre- and post-processing to allow alternative transforms (such as Wavelet Transform (WLT), Wavelet Packet Transform (WPT), Lapped Orthogonal Transform (LOT), Generalized Lapped Orthogonal Transform (GenLOT), and Generalized Lapped Biorthogonal Transform (GLBT)) to be used with standard-compliant video compression coders (for example JPEG/MPEG/H.26x standards).
2. Background of the Related Art
The trend of visual communication has evolved rapidly in recent years as a result of advances in hardware technology, as well as the proliferation of multimedia-based software applications, such as Internet browsers. Standards for transmitting and storing visual information also encourage rapid deployment of interchangeable multimedia consumer products.
Contemporary high-quality image/video compression techniques commonly employ some form of forward and inverse transforms. Widely-used image and video compression standards include the JPEG, MPEG, H.261, and H.263 compression techniques, which are based on the Discrete Cosine Transform (DCT). The well-known JPEG image encoding technique, developed by the Joint Photographic Experts Group, is widely used in image compression software and hardware. As illustrated in the block diagram of FIG. 1, the image is divided into a number of 8×8 blocks of data elements 40, each of which is then transformed at a transform process 42 using a 2-dimensional DCT. The transform coefficients are next arranged into 64 sub-bands at spectrum estimator 44, are scalar-quantized at quantizer 46, adaptively baseline-coded and Huffman-coded at coder 48, and stored in memory 50.
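The transform and quantization stages described above can be sketched roughly as follows. This is a minimal illustration, not the JPEG reference algorithm: it assumes a single uniform quantizer step in place of the standard quantization tables, omits the sub-band arrangement and entropy coding entirely, and requires image dimensions that are multiples of 8. The function names `dct_matrix`, `block_dct_quantize`, and `block_dequantize_idct` are hypothetical.

```python
import numpy as np

def dct_matrix(m: int = 8) -> np.ndarray:
    """Orthonormal DCT-II matrix of size m x m; each row is one basis function."""
    n = np.arange(m)
    k = n.reshape(-1, 1)
    t = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    t[0, :] /= np.sqrt(2.0)  # scale the DC row so that all rows have unit norm
    return t

def block_dct_quantize(image: np.ndarray, step: float = 16.0, m: int = 8) -> np.ndarray:
    """Apply a 2-D DCT to every m x m block, then quantize with one uniform step.

    Assumption: a single step size stands in for the JPEG quantization tables.
    """
    t = dct_matrix(m)
    h, w = image.shape
    indices = np.zeros((h, w), dtype=np.int32)
    for r in range(0, h, m):
        for c in range(0, w, m):
            block = image[r:r + m, c:c + m].astype(np.float64)
            coeffs = t @ block @ t.T  # separable 2-D DCT: columns, then rows
            indices[r:r + m, c:c + m] = np.round(coeffs / step).astype(np.int32)
    return indices

def block_dequantize_idct(indices: np.ndarray, step: float = 16.0, m: int = 8) -> np.ndarray:
    """Invert block_dct_quantize up to the quantization error."""
    t = dct_matrix(m)
    h, w = indices.shape
    image = np.zeros((h, w), dtype=np.float64)
    for r in range(0, h, m):
        for c in range(0, w, m):
            coeffs = indices[r:r + m, c:c + m] * step
            image[r:r + m, c:c + m] = t.T @ coeffs @ t  # inverse of an orthonormal DCT
    return image
```

Because the matrix is orthonormal, the inverse transform is simply its transpose, so the only loss in this sketch comes from the rounding step.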
The well-known MPEG video encoding technique, developed by the Moving Picture Experts Group, achieves a high compression ratio, and a corresponding significant bit rate reduction, by taking advantage of the correlation between adjacent pixels in the spatial domain (using the DCT), and the correlation between image frames in the time domain (using motion estimation and prediction).
The JPEG technique yields good results for compression ratios of 10:1 and below (assuming 8-bit gray-scale images); however, at higher compression ratios, the underlying block nature of the transform begins to manifest itself in the compressed image. As compression ratios approach 24:1, only the DC (lowest frequency) coefficient has data bits allocated to it, and, at this ratio, the input image has been approximated by a set of 8×8 blocks. The reconstructed image therefore will exhibit blocking artifacts.
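To make the 24:1 case concrete, the sketch below works out the bit budget per block and shows what an image looks like when only the DC coefficient of each block survives. It rests on two assumptions that are not in the text: the DC coefficient itself is left unquantized, and the transform is an orthonormal 8×8 DCT, for which a DC-only reconstruction equals the block mean. The names `bits_per_block` and `dc_only_approximation` are hypothetical.

```python
import numpy as np

def bits_per_block(bit_depth: int = 8, ratio: float = 24.0, m: int = 8) -> float:
    """Average number of bits available to one m x m block at a given compression ratio."""
    return bit_depth / ratio * m * m

def dc_only_approximation(image: np.ndarray, m: int = 8) -> np.ndarray:
    """Replace every m x m block by its mean value.

    For an orthonormal block DCT this is what the decoder reconstructs when
    all coefficients except the DC term are zero.
    """
    h, w = image.shape
    blocks = image.astype(np.float64).reshape(h // m, m, w // m, m)
    means = blocks.mean(axis=(1, 3), keepdims=True)
    return np.broadcast_to(means, blocks.shape).reshape(h, w)

# A smooth horizontal ramp: neighbouring pixels differ by exactly 1.
ramp = np.tile(np.arange(16, dtype=np.float64), (16, 1))
approx = dc_only_approximation(ramp)
jump = approx[0, 8] - approx[0, 7]  # step across the block boundary
```

At 24:1 an 8-bit image has on average about 21 bits per 8×8 block. In the ramp example the reconstruction is flat inside each block and jumps by 8 gray levels at the block boundary, which is the blocking artifact the paragraph describes.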
Several transforms with overlapping basis functions have been proposed for addressing the blocking artifacts. Among them are the Lapped Orthogonal Transform (LOT), "Signal Processing with Lapped Transforms," H. S. Malvar, Norwood, Mass.: Artech House, 1992; the Generalized Lapped Orthogonal Transform (GenLOT), "The GenLOT: Generalized Linear-Phase Lapped Orthogonal Transform," R. L. de Queiroz, T. Q. Nguyen and K. R. Rao, IEEE Transactions on Signal Processing, V44, N3, pp. 497-507, March 1996; the Wavelet Transform (WLT), "Wavelets and Filter Banks," G. Strang and T. Nguyen, Wellesley-Cambridge Press, 1996; and the Generalized Lapped Biorthogonal Transform (GLBT), "The Generalized Lapped Biorthogonal Transform," T. Tran, R. de Queiroz and T. Nguyen, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, April 1998.
These overlapping-basis transforms reduce blocking artifacts by borrowing pixels from adjacent blocks to produce the transform coefficients of the current block.
FIG. 2 depicts the aforementioned process for the case of the 8-channel LOT where the basis functions of the forward and inverse transforms have a length of 16.

Referring to FIG. 2, the transformed sequences $X_{DCT}(k)$ 54A and $X_{LOT}(k)$ 54B, both of length M, are computed as:

$$X_{DCT}(k) = T_{DCT} \cdot x_{DCT}(n) \qquad \text{(DCT Processing)} \qquad (1a)$$

$$X_{LOT}(k) = T_{LOT} \cdot x_{LOT}(n) \qquad \text{(LOT Processing)} \qquad (1b)$$

where $T_{DCT}$ and $T_{LOT}$ represent matrices consisting of M-basis functions for the DCT and LOT forward transforms 53 respectively. The sizes of the $T_{DCT}$ and $T_{LOT}$ matrices are M by M and M by 2M, respectively. The vectors $x_{DCT}(n)$ 52A and $x_{LOT}(n)$ 52B, of sizes M and 2M respectively, contain appropriate samples from the input image. The reconstructed sequence $\hat{x}_{DCT}(n)$ 56A and $\hat{x}_{LOT}(n)$ 56B can be defined similarly by applying the inverse transform 55 to the transformed sequences 54A, 54B. Note that the above description can be extended to two-dimensional sequences (as images), to three-dimensional sequences (as group of images or video) and to multidimensional images. The sizes of the transform can also be arbitrary (not necessarily 2M as in the LOT processing case), as in GenLOT, GLBT and wavelet processing.
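The difference in matrix shapes between equations (1a) and (1b) can be illustrated with a short one-dimensional sketch. The text does not give the entries of the LOT matrix, so the code below substitutes a different lapped transform with a known closed form, the Modulated Lapped Transform (MLT) with a sine window, purely as a stand-in of the same M by 2M shape. The function names, the zero padding at the signal ends, and the choice M = 8 are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(m: int) -> np.ndarray:
    """M x M orthonormal DCT-II matrix, playing the role of T_DCT in (1a)."""
    n = np.arange(m)
    k = n.reshape(-1, 1)
    t = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    t[0, :] /= np.sqrt(2.0)
    return t

def mlt_matrix(m: int) -> np.ndarray:
    """M x 2M Modulated Lapped Transform matrix.

    Stand-in for T_LOT in (1b): same shape, but NOT the LOT itself.
    """
    n = np.arange(2 * m)
    k = np.arange(m).reshape(-1, 1)
    window = np.sin((n + 0.5) * np.pi / (2 * m))
    return window * np.sqrt(2.0 / m) * np.cos((n + (m + 1) / 2.0) * (k + 0.5) * np.pi / m)

def lapped_analysis(x: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Each output block of M coefficients is computed from 2M input samples
    that overlap the neighbouring blocks by M samples."""
    m = t.shape[0]
    padded = np.concatenate([np.zeros(m), x, np.zeros(m)])
    n_blocks = len(padded) // m - 1
    return np.array([t @ padded[b * m:(b + 2) * m] for b in range(n_blocks)])

def lapped_synthesis(coeffs: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Inverse transform by overlap-add of 2M-sample segments."""
    n_blocks, m = coeffs.shape
    padded = np.zeros((n_blocks + 1) * m)
    for b in range(n_blocks):
        padded[b * m:(b + 2) * m] += t.T @ coeffs[b]
    return padded[m:-m]  # drop the zero padding added during analysis

M = 8
T_DCT, T_LAPPED = dct_matrix(M), mlt_matrix(M)
x = np.random.default_rng(0).standard_normal(8 * M)
x_hat = lapped_synthesis(lapped_analysis(x, T_LAPPED), T_LAPPED)
```

Here `T_DCT` has shape (8, 8) and `T_LAPPED` has shape (8, 16), matching the M by M and M by 2M sizes stated above. In this sketch the overlap-add synthesis recovers the input signal because each interior sample is covered by two adjacent blocks.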
With application of the overlapping basis function transform, blocking artifacts are substantially reduced or eliminated. However, these overlapping transforms are not standard compliant and therefore are not compatible for use with compression standards such as JPEG, MPEG, H.261, H.263, etc., since the standards are fixed, and do not allow for a change of basis function. Furthermore, the transform operations are commonly embedded in the coder hardware or software in a manner that does not allow for user access. Once widely deployed in the form of an application specific integrated circuit (ASIC) or software, alteration of the transform is nearly impossible.
SUMMARY OF THE INVENTION
The present invention is directed to an apparatus and method for image/video enhancement. More particularly, the present apparatus and method employ pre-processing and post-processing techniques to effectively modify the transforms used in a fixed, standardized coder. In this manner, alternative transforms, for example overlapping-basis-type transforms, are made to be applicable to, and compatible with, various data compression standards, thereby improving system performance.
In a first embodiment, the present invention is directed to an apparatus and method for pre-processing input data in a data compression system employing a fixed transform. The input data is modified by a cancellation transform which is the substantial inverse of the fixed transform to generate pre-processed data. The pre-processed data are applied to the fixed transform to generate compressed data, the compressed data being substantially unaffected by the fixed transform.
In a preferred embodiment, the present invention further comprises modifying the input data by an alternative transform, for example a Lapped Orthogonal Transform, a Generalized Lapped Orthogonal Transform, a Wavelet Transform, a Wavelet Packet Transform, or other artifact-reduction transform. The compressed data may be decompressed by applying a decompression transform which is the substantial inverse of the fixed transform. The decompressed data may be further applied to the fixed transform, to generate transformed decompressed data which is substantially unaffected by the fixed transform. The fixed transform may comprise a Discrete Cosine Transform (DCT). The transformed decompressed data may be applied to an inverse of the alternative transform to generate output data.
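The pre- and post-processing chain summarized above can be sketched in one dimension as follows. This is an illustrative reading of the summary, not the patented implementation: quantization and entropy coding inside the standard coder are omitted, the fixed transform is taken to be an orthonormal 8-point DCT, and a seeded random orthogonal matrix stands in for the alternative transform (a real system would use a LOT, GenLOT, or wavelet transform, typically with overlapping blocks). All function and variable names are hypothetical.

```python
import numpy as np

def dct_matrix(m: int = 8) -> np.ndarray:
    """Orthonormal DCT-II matrix, standing in for the coder's fixed transform."""
    n = np.arange(m)
    k = n.reshape(-1, 1)
    t = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    t[0, :] /= np.sqrt(2.0)
    return t

def random_orthogonal(m: int, seed: int = 0) -> np.ndarray:
    """Hypothetical placeholder for an alternative transform (assumption)."""
    q, _ = np.linalg.qr(np.random.default_rng(seed).standard_normal((m, m)))
    return q

T_FIXED = dct_matrix(8)       # embedded in the standard coder, cannot be changed
T_ALT = random_orthogonal(8)  # transform we would actually like to use

def preprocess(x: np.ndarray) -> np.ndarray:
    """Alternative transform followed by the cancellation (inverse fixed) transform."""
    return T_FIXED.T @ (T_ALT @ x)

def standard_encode(x: np.ndarray) -> np.ndarray:
    """Fixed forward transform of the standard coder (quantization omitted)."""
    return T_FIXED @ x

def standard_decode(c: np.ndarray) -> np.ndarray:
    """Fixed inverse transform of the standard decoder."""
    return T_FIXED.T @ c

def postprocess(y: np.ndarray) -> np.ndarray:
    """Re-apply the fixed transform, then invert the alternative transform."""
    return T_ALT.T @ (T_FIXED @ y)

x = np.random.default_rng(1).standard_normal(8)
coeffs = standard_encode(preprocess(x))        # equals T_ALT @ x
x_out = postprocess(standard_decode(coeffs))   # equals x when nothing is quantized
```

The coefficients that reach the quantizer are those of the alternative transform, because the coder's DCT undoes the cancellation transform applied in pre-processing. The coder itself is never modified.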
The input and output data may comprise a variety of signals, for example image data, video data, audio data, and multidimensional data. For multidimensional data, the pre- and post-processing techniques are preferably applied across some or all rows and columns, and across some or all dimensions. The fixed transform may be applied to a standard compression system, for example JPEG, MPEG-I, MPEG-II, H.261, H.263, H.263+, and H.324. The pre- and post-processing techniques of the present invention may be applied to all images of a video or audio sequence, or a subset th
Nguyen Truong Q.
Rosiene Joel A.
Bayat Ali
Mehta Bhavesh
Mills & Onello LLP