Image analysis – Image compression or coding – Pyramid, hierarchy, or tree structure
Reexamination Certificate (active)
2002-03-01
2004-07-06
Johns, Andrew W. (Department: 2621)
C382S248000
06760482
FIELD OF THE INVENTION
The present invention relates to the lossy compression of still images and, more particularly, to an improved method for assigning bits to different spatial and frequency portions of the compressed image so as to maximise perceived visual quality.
BACKGROUND OF THE INVENTION
Conventional image compression systems, such as that represented by the well-known baseline JPEG standard, suffer from a number of problems of which the following three are notable.
1) They are unable to exploit visual masking and other properties of the Human Visual System (HVS) that vary spatially with image content. This is because the quantization parameters used by these algorithms are constant over the extent of the image. As a result, images cannot be compressed as efficiently as they could be if visual masking were taken into account.
2) To achieve a target bit-rate or visual quality, the image must be compressed multiple times, while varying one or more of the quantization parameters in an iterative fashion. This is known as the rate-control problem and it enters into many practical image compression applications, including the compression of digital camera images and page compression to save memory within printers, scanners and other such peripheral devices.
3) The target bit-rate and desired viewing resolution must be known prior to compression. By contrast, for many applications, a scalable bit-stream is highly desirable. A scalable bit-stream is one which may be partially transmitted or decompressed so as to reconstruct the image with lower quality or at a lower resolution, such that the quality of the reconstructed image is comparable to that which would have been achieved if the relevant bit-rate and resolution were known when the image was compressed. Obviously, this is a desirable property for compressed image databases, which must allow remote clients access to the image at the resolution and bit-rate (i.e. download time) of their choice. Scalability is also a key requirement for robust transmission of images over noisy channels. The simplest and most commonly understood example of a scalable bit-stream is a so-called “progressive” bit-stream. A progressive bit-stream has the property that it can be truncated to any length and the quality of the reconstructed image should be comparable to that which could have been achieved if the image had been compressed to the truncated bit-rate from the outset. Scalable image compression clearly represents one way of achieving non-iterative rate-control and so addresses the concerns of item 2) above.
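The progressive-bit-stream property described in item 3) can be illustrated with a small sketch. This is an illustrative construction only, not the coder of the present invention: coefficients are emitted sign first, then magnitude bit-plane by bit-plane, most significant plane first, so that any prefix of the stream yields a coarser but usable reconstruction of every coefficient.

```python
import numpy as np

def bitplane_encode(coeffs, num_planes=8):
    # Sign bits first, then magnitude bit-planes from most to least
    # significant, so every prefix of the stream refines all coefficients.
    mags = np.abs(coeffs).astype(np.uint8)
    signs = (coeffs < 0).astype(np.uint8)
    stream = [signs]
    for p in range(num_planes - 1, -1, -1):
        stream.append((mags >> p) & 1)
    return np.concatenate(stream)

def bitplane_decode(stream, n, num_planes=8, planes_kept=8):
    # Reconstruct from only the first `planes_kept` magnitude planes,
    # emulating truncation of the progressive bit-stream.
    signs = stream[:n]
    mags = np.zeros(n, dtype=np.uint8)
    for i in range(planes_kept):
        p = num_planes - 1 - i
        mags |= (stream[n * (i + 1): n * (i + 2)] << p).astype(np.uint8)
    return np.where(signs == 1, -mags.astype(int), mags.astype(int))

coeffs = np.array([100, -37, 5, -2])
stream = bitplane_encode(coeffs)
coarse = bitplane_decode(stream, len(coeffs), planes_kept=3)  # truncated
exact = bitplane_decode(stream, len(coeffs), planes_kept=8)   # full stream
```

Truncating to three planes reproduces the large coefficients approximately and zeros the small ones, which is exactly the graceful degradation a progressive bit-stream is meant to provide.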
A number of solutions have been proposed to each of these problems. The APIC image compression system (Höntsch and Karam, “APIC: Adaptive Perceptual Image Coding Based on Sub-band Decomposition with Locally Adaptive Perceptual Weighting,” International Conference on Image Processing, vol. 1, pp. 37-40, 1997) exploits visual masking in the Wavelet transform domain through the use of an adaptive quantizer, which is driven by the causal neighbourhood of the sample being quantized, consisting of samples from the same sub-band. The approach has a number of drawbacks: it is inherently not scalable; iterative rate-control is required; and the masking effect must be estimated from a causal neighbourhood of the sample being quantized, in place of a symmetric neighbourhood, which would model the HVS more accurately. On the other hand, a variety of solutions have been proposed to the second and third problems. Some of the more relevant examples are the SPIHT (A. Said and W. Pearlman, “A New, Fast and Efficient Image Codec based on Set Partitioning in Hierarchical Trees,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 243-250, June 1996) and EBCOT (D. Taubman, “EBCOT: Embedded Block Coding with Optimised Truncation,” ISO/IEC JTC 1/SC 29/WG1 N1020R, Oct. 21, 1998) image compression methods. These both produce highly scalable bit-streams and directly address the rate-control problem; however, they focus on minimising Mean Squared Error (MSE) between the original and reconstructed images, rather than minimising visual distortion. Some attempts have been made to exploit properties of the HVS within the context of SPIHT and other scalable compression frameworks; however, these approaches focus on spatially uniform properties such as the Contrast Sensitivity Function (CSF), and are unable to adapt spatially to exploit the important phenomenon of visual masking. The compression system proposed by Mazzarri and Leonardi (A. Mazzarri and R. Leonardi, “Perceptual Embedded Image Coding using Wavelet Transforms,” International Conference on Image Processing, vol. 1, pp. 586-589, 1995) is an example of this approach. Also worth mentioning here is the method proposed by Watson (A. B. Watson, “DCT Quantization Matrices Visually Optimized for Individual Images,” Proceedings of the SPIE, vol. 1913, pp. 202-216, 1993) for optimising the quantization tables in the baseline JPEG image compression system. Although this method is restricted to space-invariant quantization parameters and non-scalable compression, by virtue of its reliance on the baseline JPEG compression standard, it does take visual masking and other properties of the HVS into account in designing a global set of quantization parameters. The visual models used in the current invention are closely related to those used by Watson and those used in APIC.
Embedded Block Coding
Embedded block coding is a method of partitioning the samples from the frequency bands of a space-frequency representation of the image into a series of smaller blocks and coding the blocks such that the bit-stream for each block can be truncated at a length selected to provide a particular distortion level. To achieve embedded block coding, the image is first decomposed into a set of distinct frequency bands using a Wavelet transform, Wavelet packet transform, Discrete Cosine Transform, or any number of other space-frequency transforms which will be familiar to those skilled in the art. The basic idea is to further partition the samples in each band into smaller blocks, which we will denote by the symbols B_1, B_2, B_3, . . . . The particular band to which each of these blocks belongs is immaterial to the current discussion. The samples in each block are then coded independently, generating a progressive bit-stream for each block, B_i, which can be truncated to any of a set of distinct lengths, R_i^1, R_i^2, . . . , R_i^(N_i), prior to decoding. Efficient block coding engines, which are able to produce a finely gradated set of truncation points, R_i^n, such that each truncated bit-stream represents an efficient coding of the small independent block of samples, B_i, have been introduced only recently as part of the EBCOT image compression system. A discussion of the techniques involved in generating such embedded block bit-streams is unnecessary here, since the present invention does not rely upon the specific mechanism used to code each block of samples, but only upon the existence of an efficient, fine embedding for independently coded blocks of samples from each frequency band.
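The partitioning and embedding just described can be sketched as follows. This is an illustrative simplification, not the EBCOT mechanism: the block size is arbitrary, a raw bit-plane dump stands in for the real coefficient coder, and no entropy coding is performed, so the truncation points R_i^n here are simply the cumulative lengths after each bit-plane.

```python
import numpy as np

def partition_blocks(band, block_size=4):
    # Split one frequency band into independent code-blocks B_1, B_2, ...
    h, w = band.shape
    return [band[r:r + block_size, c:c + block_size]
            for r in range(0, h, block_size)
            for c in range(0, w, block_size)]

def embed_block(block, num_planes=8):
    # Emit magnitude bit-planes from most to least significant; the
    # cumulative length after each plane is a valid truncation point R_i^n.
    mags = np.abs(block).astype(np.uint8).ravel()
    bits, truncation_points = [], []
    for p in range(num_planes - 1, -1, -1):
        bits.extend(((mags >> p) & 1).tolist())
        truncation_points.append(len(bits))
    return bits, truncation_points

band = np.arange(64).reshape(8, 8) - 32   # a toy 8x8 sub-band
blocks = partition_blocks(band)           # four independent 4x4 blocks
streams = [embed_block(b) for b in blocks]
```

Because each block is coded independently, any block's stream may later be cut back to any of its recorded truncation points without affecting the others.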
The motivation for considering embedded block coding is that each block may be independently truncated to any desired length in order to optimise the trade-off between the size of the overall compressed bit-stream representing the image and the distortion associated with the image which can be reconstructed from this bit-stream. In the simplest incarnation of the idea, each block bit-stream is truncated to one of the available lengths, R_i^n, in whatever manner is deemed most appropriate, after which the truncated bit-streams are concatenated in some pre-determined order, including sufficient auxiliary information to identify the truncation point, n_i, and length, R_i^(n_i), associated with each block. Evidently, this provides an elegant solution to the rate-control problem described above. In more sophisticated incarnations of the idea, the overall compressed bit-stream might be organised
Dority & Manning
Unisearch Limited
Method for visual optimisation of embedded block codes to...