Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal
Reexamination Certificate
2001-05-15
2004-09-21
Kelley, Chris (Department: 2613)
Pulse or digital communications
Bandwidth reduction or expansion
Television or motion video signal
Reexamination Certificate
active
06795505
ABSTRACT:
FIELD OF THE INVENTION
The present invention first relates to an encoding method for the compression of a video sequence including successive frames organized in groups of frames, each frame being decomposed by means of a three-dimensional (3D) wavelet transform leading to a given number of successive resolution levels, said encoding method being based on the hierarchical subband encoding process called “set partitioning in hierarchical trees” (SPIHT) and leading from the original set of picture elements (pixels) of each group of frames to wavelet transform coefficients encoded with a binary format and constituting a hierarchical pyramid, said coefficients (i,j) being organized into a spatio-temporal orientation tree rooted in the lowest frequency (or approximation subband) resulting from the 3D wavelet transform and completed by an offspring in the higher frequency subbands, the coefficients of said tree being ordered into partitioning sets involving the pixels and corresponding to respective levels of significance, said sets being defined by means of magnitude tests leading to a classification of the significance information in three ordered lists called list of insignificant sets (LIS), list of insignificant pixels (LIP) and list of significant pixels (LSP), said tests being carried out in order to divide said original set of pixels into said partitioning sets according to a division process that continues until each significant coefficient is encoded within said binary representation, and said spatio-temporal orientation tree defining the spatio-temporal relationship inside said hierarchical pyramid.
The invention also relates to an encoding method for the compression of a video sequence including successive frames organized in groups of frames, each frame being decomposed by means of a three-dimensional (3D) wavelet transform leading to a given number of successive resolution levels, said encoding method being based on a hierarchical subband encoding process leading from the original set of picture elements (pixels) of each group of frames to wavelet transform coefficients constituting a hierarchical pyramid, a spatio-temporal orientation tree-in which the roots are formed with the pixels of the approximation subband resulting from the 3D wavelet transform and the offspring of each of these pixels is formed with the pixels of the higher subbands corresponding to the image volume defined by these root pixels—defining the spatio-temporal relationship inside said hierarchical pyramid, said subbands being scanned one after the other in an order that respects the parent-offspring dependencies formed in said spatio-temporal tree, and flags “off/on” being added to each coefficient (i,j) of said spatio-temporal tree in view of a progressive transmission of the most significant bits of the coefficients and in such a manner that at least one of them describes the state of a set of pixels and at least another one describes the state of a single pixel.
BACKGROUND OF THE INVENTION
The expansion of multimedia applications is now making the scalability one of the most important functionalities of video compression schemes. Scalability allows delivering multiple levels of quality or spatial resolutions/frame rates in an embedded bitstream towards receivers with different requirements and encoding capabilities. Current standards like MPEG-4 have implemented scalability in a predictive DCT-based framework through additional high-cost layers. More efficient solutions based on a three-dimensional wavelet decomposition followed by a hierarchical encoding of the spatio-temporal trees like the Set Partitioning In Hierarchical Trees algorithm (SPIHT) have been recently proposed as an extension of still image coding techniques (the original SPIHT algorithm is described for instance in “A new, fast, and efficient image codec based on set partitioning in hierarchical trees”, by A. Said and W. A. Pearlman, IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, n°3, June 1996, pp. 243-250, and the extension of this algorithm to the 3D case is described for instance in “An embedded wavelet video coder using three-dimensional set partitioning in hierarchical trees (SPIHT)”, B. J. Kim and W. A. Pearlman, Proceedings of Data Compression Conference, Mar. 25-27, 1997, Snowbird, Utah, USA, pp.251-260). The 3D wavelet decomposition provides a natural spatial resolution and frame rate scalability, while the in-depth scanning of the obtained coefficients in the hierarchical trees and the bitplane encoding lead to the desired quality scalability with a high compression ratio.
The SPIHT algorithm is based on a key concept: a partial sorting of the coefficients according to a decreasing magnitude, and the prediction of the absence of significant information across scales of the wavelet decomposition by exploiting self-similarity inherent in natural images. This means that if a coefficient is insignificant at the lowest scale of the wavelet decomposition, the coefficients corresponding to the same area at the other scales have a high probability to be insignificant too. Basically, the SPIHT is an iterative algorithm that consists in comparing a set of pixels corresponding to the same image area at different resolutions with a value called “level of significance” from the maximal significance level found in the spatio-temporal decomposition tree down to 0. For a given level, or bitplane, two passes are carried out: the sorting pass, which looks for zero-trees or sub-trees and sorts insignificant and significant coefficients, and the refinement pass, which sends the precision bits of the significant coefficients. The SPIHT algorithm examines the wavelet coefficients from the highest level of the decomposition to the lowest one. This corresponds to first considering the coefficients corresponding to important details located in the smallest scale subbands, with increasing resolution, then examining the smallest coefficients, which correspond to fine details. This justifies the “hierarchical” designation of the algorithm: the bits are sent by decreasing importance of the details they represent, and a progressive bitstream is thus formed.
A tree structure, called spatial (or spatio-temporal in the 3D case) orientation tree, defines the spatial (or spatio-temporal) relationship inside the hierarchical pyramid of wavelet coefficients. The roots of the trees are formed with the pixels of the approximation subband at the lowest resolution (“root” subband), while the pixels of the higher subbands corresponding to the image area (to the image volume, in the 3D case) defined by the root pixel form the offspring of this pixel. In the 3D version of the SPIHT algorithm, each pixel of any subband but the leaves has 8 offspring pixels, and each pixel has only one parent. There is one exception at this rule: in the root case, one pixel out of 8 has no offspring. The following notations describe the parent-offspring relationship, an illustration of these dependencies being given in
FIG. 1
(three-dimensional case) where the notations are the following: TF=temporal frame, TAS=temporal approximation subband, CFTS=coefficients in the spatio-temporal approximation subbands (or root coefficients), TDS.LRL=temporal detail subband at the last resolution level of the decomposition, and TDS.HR=temporal detail subband at higher resolution:
O(x,y,z): set of coordinates of the direct offspring of the node (x,y,z);
D(x,y,z): set of coordinates of all descendants of the node (x,y,z);
H(x,y,z): set of coordinates of all spatio-temporal orientation tree roots (nodes in the highest pyramid level: spatio-temporal approximation subband);
L
(
x,y,z
)=
D
(
x,y,z
)−
O
(
x,y,z
).
The SPIHT algorithm makes use of three lists: the LIS (list of insignificant sets), the LIP (list of insignificant pixels), and the LSP (list of significant pixels). In all these lists, each entry is identified by a coordinate (x,y,z). In the LIP and LIS, (x,y,z) represents a unique coefficient, while in the LIS i
Kelley Chris
Koninklijke Philips Electronics , N.V.
Senfi Behrooz
LandOfFree
Encoding method for the compression of a video sequence does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Encoding method for the compression of a video sequence, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Encoding method for the compression of a video sequence will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3254770