Coded data generation or conversion – Digital code to digital code converters
Reexamination Certificate
2003-11-04
2004-12-14
Tokar, Michael (Department: 2819)
Coded data generation or conversion
Digital code to digital code converters
C341S055000, C341S095000
Reexamination Certificate
active
06831575
ABSTRACT:
REFERENCE TO A COMPUTER PROGRAM
Not applicable.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention pertains generally to a word aligned compression method for arbitrary length and content bitmaps, and more particularly to methods of logical operations on such word aligned compressed bitmaps.
2. Description of the Relevant Art
U.S. Pat. No. 5,363,098, incorporated herein by reference, discloses a method for compressing, decompressing, and logically manipulating arbitrary bitmaps. The method uses what has been found to be a relatively computationally expensive byte boundary compression (BBC) method that produces very good compression ratios. However, due to the byte boundaries used, the BBC method uses significantly more computational effort than a method based upon the compression method using word alignment strategies, as described below.
Many commercial and scientific applications need to search and otherwise query large amounts of data. When the data is modified infrequently, as is the case in general databases, image databases, statistical databases, data warehouses, and large datasets produced by scientific experiments and simulations (generally referred to as write once—read many datasets), one of the faster indexing strategies is the bitmap index. Bitmap indexing methods are frequently the most efficient scheme for ad hoc searches on large datasets, a type of operation common in on-line analytical processing (OLAP) and business intelligence systems.
To reduce the large sizes of bitmap indices, the bitmaps are compressed. However, general-purpose compression schemes are not efficient for database query operations because such operations on the compressed bitmaps are too slow. A number of specialized schemes have been devised to address the lack of speed issue; such as the Byte-aligned Bitmap Code (BBC mentioned above). BBC has generally been considered to be most efficient of all known compression methods.
The new bitmap compression technique described herein uses a bitmap compression data structure and associated computer programmed methods tailored to use the compressed data structure. The bitmap compression method described herein, called Word-Aligned Hybrid (WAH) bitmap compression, increases the execution speed of operations by utilizing the compressed bitmaps. To achieve this increased speed, a compression method has been designed to store data in a way that requires less central processor unit (CPU) effort to compress and decompress bitmaps. Inventor tests have shown that on specific real and synthetic data sets, on the average, the WAH method may be as much as ten times faster than the BBC compression scheme when performing logical operations between compressed bitmaps. The specific data sets were selected to represent a wide spectrum of scientific and industrial applications.
Bitmap indices have previously been generally regarded as efficient for low cardinality attributes, i.e., those attributes with only a small number of different distinct values. Analyses and tests conducted by the inventors have demonstrated that with WAH compression the bitmap indices may also be efficient for high cardinality attributes. Thus the compressed bitmap index may now be used effectively on many different attributes.
BRIEF SUMMARY OF THE INVENTION
This invention provides for a bitmap compression data structure located in a computer readable medium comprising: a) a compressed bitmap comprised of words that have a length of Word Length (WL) bits, wherein said compressed bitmap represents sequentially corresponding bits in an uncompressed bitmap; b) an activeWord; c) an activeWord bit length, nbits, comprising: i) a series of remainder bits, numbering less than WL minus two, stored in the activeWord, (1) the series of remainder bits forming a remainder after the uncompressed bitmap has been parsed into contiguous groupings of WL minus one bits, (2) the nbits in the activeWord represents the last contiguous grouping of nbits bits in the uncompressed bitmap; and d) each of the words of the compressed bitmap having a literal/fill bit, wherein: i) in each of the words having the literal/fill bit that represents a set state, the compressed bitmap word further comprises: (1) a fill bit value having a binary state, and (2) a fill length, wherein the compressed bitmap word represents a contiguous grouping of (fill length*(WL minus one)) bits having the fill bit value state in the corresponding uncompressed bitmap; ii) in each of the words having the literal/fill bit that represents a not set state, a literal word represents a contiguous grouping of WL minus one bits of the corresponding uncompressed bitmap.
In an alternative embodiment, the bitmap compression data structure described above may have said compressed bitmap represent either the literal or a binary complement of the literal uncompressed bitmap.
In another embodiment, the bitmap compression data structure described in the above paragraph may have said literal/fill bit, where: a) is comprised of one or more bits; b) whereby said literal/fill bit alternatively represents either the set state, or the not set state.
In another embodiment, the bitmap compression data structure described in the above paragraph may have: a) said literal word representing the contiguous grouping of WL minus one bits by either the literal, a binary complement of the literal uncompressed bitmap, or any isomorphic transformation.
In another embodiment, the bitmap compression data structure described in the above paragraph may have: a means for calculating, on a computer, the bitmap compression data structure of the corresponding uncompressed bitmap.
In another embodiment, this invention provides for a method of bitmap compression comprising: a) initializing a Word Length (WL), a compressed bitmap of WL bits, an activeWord, and an activeWord bit length; i) providing a literal/fill bit in each Word Length word of the compressed bitmap by: (1) if the literal/fill bit is set to indicate a literal value, then (a) setting in the remaining WL minus one bits a literal representation of an uncompressed bitmap; (2) if the literal/fill bit is set to indicate a fill value, then (a) setting, in a fill bit, a representation of the fill value, and (b) setting a plurality of the remaining WL minus two bits to contain a representation, k, of a number of fill values repeated k*(WL minus one) times in the uncompressed bitmap; and b) continuing compression until done.
In another embodiment, the method of Word Aligned Hybrid bitmap compression of the above paragraph may have said initializing step further comprising: a) setting the compressed bitmap to an empty state; b) setting the activeWord to an empty state; and c) setting the activeWord bit length to zero length.
In another embodiment, the method of bitmap compression of described in the above paragraph may have said continuing compression step further comprising: a) starting at the beginning of the uncompressed bitmap and the beginning of the compressed bitmap that results from compression; and b) looping until the uncompressed bitmap is entirely compressed, the looping step comprising: i) sequentially inputting the WL minus one bits of the uncompressed bitmap into the activeWord; (1) if the end of the uncompressed bitmap is reached at any bit, then (a) setting the activeWord bit length to the number of the uncompressed bitmap bits input; and (2) exiting; ii) if the first WL minus one bits of the activeWord are identical and a current word in the compressed bitmap has the literal/fill bit set to the fill value representing the bits of the activeWord, then; (1) incrementing the number of fills, k, in the current word by one; (2) setting the activeWord to an empty state; and (3) setting the activeWord bit length to zero length; iii) else; (1) appending a new word to the compressed bitmap, which becomes the current word; (2) setting the literal/fill bit of the current word to a representation of the literal state in the compressed bitmap; and (3) setting the literal value to a representation of the
Otoo Ekow
Shoshani Arie
Wu Kesheng
Milner Joseph R.
Nguyen Khai M.
The Regents of the University of California
Tokar Michael
LandOfFree
Word aligned bitmap compression method, data structure, and... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Word aligned bitmap compression method, data structure, and..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Word aligned bitmap compression method, data structure, and... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3273060