Video encoding method using a wavelet transform

Image analysis – Image compression or coding – Pyramid – hierarchy – or tree structure

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C375S240190

Reexamination Certificate

active

06735342

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an encoding method for the compression of a video sequence divided, in frames or groups of frames decomposed by means of a wavelet transform leading to a given number of successive resolution levels that correspond to the decomposition levels of said transform, said encoding method being based on the hierarchical subband encoding process called “set partitioning in hierarchical trees” (SPIHT) and leading from the original set of picture elements of the video sequence to wavelet transform coefficients encoded with a binary format, said coefficients being organized in trees and ordered into partitioning sets corresponding to respective levels of significance, said sets being defined by means of magnitude tests leading to a classification of the significance information in three ordered lists called list of insignificant sets (LIS), list of insignificant pixels (LIP) and list of significant pixels (LSP), and said tests being carried out in order to divide said original set of picture elements into said partitioning sets according to a division process that continues until each significant coefficient is encoded within said binary representation.
2. Description of the Related Art
Classical video compression schemes may be considered as comprising four main modules: motion estimation and compensation, transformation in coefficients (for instance, discrete cosine transform or wavelet decomposition), quantification and encoding of the coefficients, and entropy coding. When a video encoder has moreover to be scalable, this means that it must be able to encode images from low to high bit rates, increasing the quality of the video with the rate. By naturally providing a hierarchical representation of images, a transform by means of a wavelet decomposition appears to be more adapted to scalable schemes than the conventional discrete cosine transform (DCT).
A wavelet decomposition allows an original input signal to be described by a set of subband signals. Each subband represents in fact the original signal at a given resolution level and in a particular frequency range. This decomposition into uncorrelated subbands is generally implemented by means of a set of monodimensional filter banks applied first to the lines of the current image and then to the columns of the resulting filtered image. An example of such an implementation is described in “Displacements in wavelet decomposition of images”, by S. S. Goh, Signal Processing, vol. 44, n°1, June 1995, pp.27-38. Practically two filters—a low-pass one and a high-pass one—are used to separate low and high frequencies of the image. This operation is first carried out on the lines and followed by a sub-sampling operation, by a factor of 2, and then carried out on the columns of the sub-sampled image, the resulting image being also down-sampled by 2. Four images, four times smaller than the original one, are thus obtained: a low-frequency sub-image (or “smoothed image”), which includes the major part of the initial content of the concerned original image and therefore represents an approximation of said image, and three high-frequency sub-images, which contain only horizontal, vertical and diagonal details of said original image. This decomposition process continues until it is clear that there is no more useful information to be derived from the last smoothed image.
A technique rather computationally simple for image compression, using a two-dimensional (2D) wavelet decomposition, is described in “A new, fast, and efficient image codec based on set partitioning in hierarchical trees (=SPIHT)”, by A. Said and W. A. Pearlman, IEEE Transactions on Circuits and Systems for Video Technology, vol.6, n°3, June 1996, pp.243-250, As explained in said document, the original image is supposed to be defined by a set of pixel values p(x,y), where x and y are the pixel coordinates, and coded by a hierarchical subband transformation, represented by the following formula (1):
c
(
x,y
)=&OHgr;(
p
(
x,y
))  (1)
where &OHgr; represents the transformation and each element c(x,y) is called “transform coefficient for the pixel coordinates (x,y)”.
The major objective is then to select the most important information to be transmitted first, which leads to order these transform coefficients according to their magnitude (coefficients; with larger magnitude have a larger content of information and should be transmitted first, or at least their most significant bits). If the ordering information is explicitly transmitted to the decoder, images with a rather good quality can be recovered as soon as a relatively small fraction of the pixel coordinates are transmitted. If the ordering information is not explicitly transmitted, it is then supposed that the execution path of the coding algorithm is defined by the results of comparisons on its branching points, and that the decoder, having the same sorting algorithm, can duplicate this execution path of the encoder if it receives the results of the magnitude comparisons. The ordering information can then be recovered from the execution path.
One important fact in said sorting algorithm is that it is not necessary to sort all coefficients, but only the coefficients such that 2
n
≦|c
x,y
|<2
n+1
, with n decremented in each pass. Given n, if |c
x,y
|≧2
n
(n being called the level of significance), it is said that a coefficient is significant; otherwise it is called insignificant. The sorting algorithm divides the set of pixels into partitioning subsets T
m
and performs the magnitude test (2):
max
(
x
,
y
)

T
m

{
&LeftBracketingBar;
c
x
,
y
&RightBracketingBar;
}

2
n


?
(
2
)
If the decoder receives a “no” (the whole concerned subset is insignificant), then it knows that all coefficients in this subset T
m
are insignificant. If the answer is “yes” (the subset is significant), then a predetermined rule shared by the encoder and the decoder is used to partition T
m
into new subsets T
m,l
, the significance test being further applied to these new subsets. This set division process continues until the magnitude test is done to all single coordinate significant subsets in order to identify each significant coefficient and to allow to encode it with a binary format.
To reduce the number of transmitted magnitude comparisons (i.e. of message bits), one may define a set partitioning rule that uses an expected ordering in the hierarchy defined by the subband pyramid. The objective is to create new partitions such that subsets expected to be insignificant contain a large number of elements, and subsets expected to be significant contain only one element. To make clear the relationship between magnitude comparisons and message bits, the following function is used:
S
n

(
T
)
=
{
1
,
max
(
x
,
y
)

T

{
&LeftBracketingBar;
c
x
,
y
&RightBracketingBar;
}

2
n
,
0
,
otherwise
,
(
3
)
to indicate the significance of a subset of coordinates T.
Furthermore, it has been observed that there is a spatial self-similarity between subbands. The coefficients are expected to be better magnitude-ordered if one moves downward in the pyramid following the same spatial orientation. For instance, if low-activity areas are expected to be identified in the highest levels of the pyramid, then they are replicated in the lower levels at the same spatial locations, but with a higher resolution. A tree structure, called spatial orientation tree, naturally defines the spatial relationship in the hierarchical subband pyramid of the wavelet decomposition.
FIG. 1
shows how the spatial orientation tree is defined in a pyramid constructed with recursive four-subband splitting. Each node of the tree corresponds to the pixels of the same spatial orientation in the way that each node has either no offspring (the leaves) or four offspring, which always form a group of 2×2 adjacent pixels. In
FIG. 1
, the arrows are oriented from the paren

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Video encoding method using a wavelet transform does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Video encoding method using a wavelet transform, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Video encoding method using a wavelet transform will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3186051

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.