Three-dimensional shape-adaptive wavelet transform for...

Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal

Reexamination Certificate


Details

Reexamination Certificate, active
Patent number: 06597739

ABSTRACT:

TECHNICAL FIELD
This invention relates to computers, and more particularly to improved methods and arrangements for coding/decoding object data using three-dimensional (3D) shape-adaptive discrete wavelet transform (SA-DWT) techniques.
BACKGROUND
Technology is transforming the modern economy and is also having a tremendous impact on the standard of living of ordinary people. For example, video conferencing is facilitating communication and enabling businesses to conduct business over great distances more efficiently. The Internet is also transforming the way in which both companies and people conduct business. In particular, the Internet has increased communication between people and has placed extraordinary amounts of information at one's fingertips.
Not only is technology transforming the economy, it is also raising the standard of living of ordinary people. For example, technology has changed the way in which people are entertained. Computer and video technology have enabled much more realistic and advanced video games. They have also improved the technical quality of movies and other video media and made them more accessible to people.
Video processing is critical to all of these technologies. Video processing is the handling and manipulation of a video signal in order to achieve certain results, including displaying an image on a monitor, compressing the signal for efficient storage or transmission, and manipulating the image.
Recently, there has been a move away from frame-based coding towards object-based coding of image data. In object-based coding, a typical scene will include a plurality of visual objects that are definable in such a way that their associated image data (e.g., shape, motion and texture information) can be specially processed in a manner that further enhances the compression and/or subsequent rendering processes. Thus, for example, a person, a hand, or an automobile may be individually coded as an object. Note that, as used herein, objects may include any type of video displayable image, such as actual captured images, virtually generated images, text, etc.
Moving Picture Experts Group (MPEG) is the name of a family of standards used for coding audio-visual information (e.g., movies, video, music, etc.) in a digital compressed format. One advantage of MPEG compared to other video and audio coding formats is that MPEG files are much smaller for the same quality. This is because MPEG employs compression techniques to code frames, or as is the case in MPEG-4 to code objects as separate frame layers.
In MPEG there are three types of coded frame layers. The first type is an "I" or intra frame, which is coded as a still image without using any past history. The second type is a "P" or predicted frame, which is predicted from the most recent I or P frame. Each macroblock of data in a P frame can either be coded with a motion vector and difference discrete cosine transform (DCT) coefficients referencing a close match in the last I or P frame, or it can be "intra" coded (e.g., as in the I frames). The third type is a "B" or bi-directional frame, which is predicted from the two closest I or P frames, one in the past and one in the future. For example, a sequence of frames may be of the form . . . IBBPBBPBBPBBIBBPBBPB . . . , which contains 12 frames from I frame to I frame. Additionally, enhancement I, P, or B frame layers may be provided to add further refinement/detail to the image. These and other features of the MPEG standard are well known.
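These prediction dependencies can be made concrete with a small, purely illustrative routine. The sketch below is not part of the MPEG standard or of this patent; it simply walks a GOP pattern string such as "IBBPBBPBBPBB" and reports which frames each frame is predicted from, following the I/P/B rules just described (a trailing "I" is appended in the example so that every B frame has a future anchor frame).

```python
# Illustrative sketch only: map each frame in a GOP pattern string to the
# frames it is predicted from, per the I/P/B rules described above.

def reference_frames(gop: str) -> list[tuple[str, list[int]]]:
    """For each frame in the GOP string, list the indices it is predicted from."""
    refs = []
    for i, kind in enumerate(gop):
        if kind == "I":
            # Intra frame: coded as a still image, no reference frames.
            refs.append((kind, []))
        elif kind == "P":
            # Predicted from the most recent I or P frame.
            prev = max(j for j in range(i) if gop[j] in "IP")
            refs.append((kind, [prev]))
        elif kind == "B":
            # Bi-directional: nearest I/P frame in the past and in the future.
            prev = max(j for j in range(i) if gop[j] in "IP")
            nxt = min(j for j in range(i + 1, len(gop)) if gop[j] in "IP")
            refs.append((kind, [prev, nxt]))
    return refs

if __name__ == "__main__":
    for idx, (kind, refs) in enumerate(reference_frames("IBBPBBPBBPBBI")):
        print(f"frame {idx:2d} ({kind}) predicted from {refs}")
```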
MPEG-4 provides the capability to further define a scene as including one or more objects. Each of these objects is encoded into a corresponding elementary data bitstream using I, P, B, and enhancement frame layers. In this manner, MPEG-4 and other similarly arranged standards can be dynamically scaled up or down, as required, for example, by selectively transmitting elementary bitstreams to provide the necessary multimedia information to a client device/application.
Unfortunately, the DCT coding scheme employed in MPEG-4 provides only limited scalability with respect to both the spatial and temporal domains. In other words, the DCT coding scheme has limited capabilities for either compressing or enlarging an image and limited capabilities for making a video run faster or slower.
More recently, DCT coding schemes are being replaced with discrete wavelet transform (DWT) coding schemes. DWT coding takes advantage of both the spatial and the frequency correlations that exist in the image data to provide even better compression of the image data.
For a two-dimensional image array (i.e., a frame layer), image data compression using DWTs usually begins by decomposing or transforming the image into four subbands or subimages. Each subimage is one-fourth the size of the original image and contains one-fourth as many data points. The decomposition involves first performing a one-dimensional wavelet convolution along each horizontal row of pixels of the original image, thereby dividing the image into two subimages containing low-frequency and high-frequency information, respectively. The same or a similar convolution is then applied along each vertical column of pixels of each subimage, dividing each of the previously obtained subimages into two further subimages that again correspond to low- and high-frequency image information.
The resulting four subimages are typically referred to as LL, LH, HL, and HH subimages. The LL subimage is the one containing low frequency information from both the vertical and horizontal wavelet convolutions. The LH subimage is the one containing low frequency image information from the horizontal wavelet convolution and high frequency image information from the vertical wavelet convolution. The HL subimage is the one containing high frequency information from the horizontal wavelet convolution and low frequency image information from the vertical wavelet convolution. The HH subimage is the one containing high frequency information from both the vertical and horizontal wavelet convolutions.
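As a rough illustration of the row-then-column decomposition just described, the following sketch performs one decomposition level using the simple Haar filter pair. The choice of the Haar filters, the helper names, and the use of NumPy are assumptions made only for this example; the patent does not prescribe any particular wavelet. The subimage labels follow the convention used above: the first letter is the horizontal-pass result, the second the vertical-pass result.

```python
# Minimal sketch, assuming a grayscale image with even height and width and the
# simple Haar filter pair (an assumption, not the patent's transform).
import numpy as np

def haar_1d(signal: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a signal along its last axis into low-pass (averages) and high-pass (differences) halves."""
    even, odd = signal[..., 0::2], signal[..., 1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return low, high

def dwt2_level(image: np.ndarray):
    """One decomposition level: horizontal pass on rows, then vertical pass on columns."""
    # Horizontal pass along each row -> half-width low (L) and high (H) subimages.
    L, H = haar_1d(image)
    # Vertical pass along each column of L and H -> four quarter-size subimages.
    LL, LH = haar_1d(L.T)
    HL, HH = haar_1d(H.T)
    return LL.T, LH.T, HL.T, HH.T

if __name__ == "__main__":
    img = np.arange(64, dtype=float).reshape(8, 8)
    LL, LH, HL, HH = dwt2_level(img)
    print(LL.shape, LH.shape, HL.shape, HH.shape)  # each subimage is 4x4
```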
The wavelet transform described above can be performed recursively on each successively obtained LL subimage. For practical purposes, it has generally been found that calculating four or five decomposition levels is sufficient for most situations.
To reconstruct the original image, the inverse wavelet transform is performed recursively at each decomposition level. For example, assuming a two-level compression scheme, the second decomposition level would include a subimage LL2 that is a low resolution or base representation of the original image. To obtain a higher resolution, a subimage LL1 is reconstructed by performing an inverse wavelet transform using the subimages of the second decomposition level. The original image, at the highest available resolution, can subsequently be obtained by performing the inverse transform using the subimages of the first decomposition level (but only after obtaining subimage LL1 through an inverse transform of the second decomposition level).
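The multi-level decomposition and the recursive reconstruction can be illustrated with the PyWavelets library (pywt), which is used here only as a convenient stand-in; the library, the "haar" wavelet, and the 16x16 test image are assumptions for this sketch and are not part of the patent. The sketch mirrors the two-level example above: LL2 serves as the low-resolution base, one inverse transform recovers LL1, and a second inverse transform recovers the original image.

```python
# Hedged sketch of two-level decomposition and recursive reconstruction using
# PyWavelets (an assumed third-party library, not named in the patent).
import numpy as np
import pywt

img = np.arange(256, dtype=float).reshape(16, 16)

# Two decomposition levels: coeffs = [LL2, (level-2 detail subimages), (level-1 detail subimages)]
coeffs = pywt.wavedec2(img, "haar", level=2)
LL2 = coeffs[0]  # low-resolution base representation of the original image

# Reconstruct LL1 from the second-level subimages only...
LL1 = pywt.waverec2(coeffs[:2], "haar")

# ...then reconstruct the full-resolution image using the first-level subimages.
full = pywt.waverec2([LL1, coeffs[2]], "haar")

print(LL2.shape, LL1.shape, full.shape)  # (4, 4) (8, 8) (16, 16)
print(np.allclose(full, img))            # True: the inverse transform is lossless
```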
The attractiveness of the wavelet approach to image compression and transmission is that subimages LH, HL, and HH contain data that can be efficiently compressed to very high compression ratios through such methods as zero-tree and arithmetic encoding.
Unfortunately, current DWT techniques also suffer from certain limitations, especially for object-based coding. For example, current DWT techniques require that objects, regardless of their shape, be isolated within a bounding box (e.g., a rectangle). As such, the resulting object-based coding data will include non-object information, and encoding this redundant non-object information requires additional bits. In addition, the non-object information will likely differ significantly from the object, so the correlation among pixels located in a row or column of the bounding box will likely be reduced. Consequently, the amount of object-based coding data will likely be greater. Therefore, such
