Method for reducing data expansion during data compression

Coded data generation or conversion – Digital code to digital code converters – Unnecessary data suppression

Reexamination Certificate

Details

C341S051000

active

06271775

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to data compression schemes, and more particularly to a method for reducing the expansion of data during data compression.
BACKGROUND OF THE INVENTION
The use of data compression or “coding” schemes to increase the storage capacity of storage media (e.g., tape drives, hard drives, etc.) is well known in the art, and can result in significant increases in data storage capacity. However, the efficiency with which data may be compressed depends on the specifics of the compression scheme employed and the type of data compressed. Depending on data entropy, certain data types may be incompressible or inefficiently compressible by the compression scheme, and may cause the data to occupy more memory space than when the data is in an uncompressed format (i.e., data expansion). For example, in many implementations of Lempel-Ziv 1 coding, including IBM's adaptive lossless data compression (ALDC), LZS (QIC 122), etc., highly random data can expand in size by up to 12.5% (e.g., from 60,000 bytes uncompressed to 67,500 bytes compressed).
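The 12.5% figure can be checked with a short calculation. The sketch below assumes that every incompressible byte is emitted as a 9-bit literal (one flag bit plus the eight data bits), which is how literals are commonly encoded in ALDC and LZS. It ignores end markers and padding, and the function name is illustrative only.

```python
def worst_case_expanded_size(n_bytes: int, literal_bits: int = 9) -> int:
    """Size in bytes if every input byte is emitted as a literal.

    literal_bits = 9 is an assumption: 1 flag bit + 8 data bits per byte.
    """
    total_bits = n_bytes * literal_bits
    return (total_bits + 7) // 8  # round up to whole bytes


uncompressed = 60_000
expanded = worst_case_expanded_size(uncompressed)
expansion = expanded / uncompressed - 1  # fractional growth
print(expanded, f"{expansion:.1%}")
```

Under that assumption, one extra bit per eight data bits gives 9/8 = 1.125, which matches the 60,000 to 67,500 byte example.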
When data expansion occurs during data compression, the very purpose of performing data compression (e.g., to increase the storage capacity of a storage medium) is subverted. Accordingly, a need exists for reducing the expansion of data during data compression.
SUMMARY OF THE INVENTION
To address the needs of the prior art, a method of reducing data expansion during data compression is provided that determines when the coding scheme used to compress data should be swapped between two or more coding schemes. Specifically, a coding window is provided that allows analysis of the compression potential of data therewithin. The data within the coding window is analyzed to determine the compression potential of the data. If the compression potential of the data reaches a first predetermined value, the coding scheme used to compress the data within the coding window is swapped from one coding scheme to another (e.g., the coding scheme used to compress the data within the coding window is swapped to a new coding scheme and the data within the coding window is then compressed using the new coding scheme). As used herein, “reaches a predetermined value” means having an absolute magnitude greater than or equal to the absolute magnitude of the predetermined value. Preferably, the first predetermined value is programmable and is related to the bit cost required to swap back and forth between coding schemes. The two preferred coding schemes are ALDC Lempel-Ziv 1 (hereinafter “LZ1”) coding and a pass-through (hereinafter “RAW”) coding scheme wherein raw data is passed unencoded.
Analysis of the compression potential of data within the coding window may be performed by many techniques, but preferably comprises computing a compression potential sum $S_p$ for p data bytes within the coding window according to the formula:
$$S_p = \sum_{n=1}^{p} f(W[n])$$
where $f(W[n])$ equals the compression potential of the nth data byte within the coding window. Swapping the coding scheme used to compress the data within the coding window from one scheme to another is performed if the compression potential sum $S_p$ reaches the first predetermined value.
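As a minimal sketch of the sum and the “reaches” test described above, the following Python computes the compression potential sum over the first p bytes of a window. The per-byte function `toy_f` is a hypothetical stand-in, since this section does not specify how the per-byte compression potential is obtained, and the helper names are illustrative.

```python
from typing import Callable, Sequence


def reaches(value: float, threshold: float) -> bool:
    """'Reaches' as defined in the text: compare absolute magnitudes."""
    return abs(value) >= abs(threshold)


def compression_potential_sum(
    window: Sequence[int], f: Callable[[int], float], p: int
) -> float:
    """Sum f(W[n]) for n = 1..p (Python indices 0..p-1)."""
    return sum(f(window[n]) for n in range(p))


# Hypothetical per-byte potential, for illustration only:
# zero bytes score -2, all other bytes score +1.
def toy_f(byte: int) -> float:
    return -2.0 if byte == 0 else 1.0


window = [0xFF, 0xA5, 0x00, 0x3C]
s2 = compression_potential_sum(window, toy_f, 2)  # 1 + 1
s4 = compression_potential_sum(window, toy_f, 4)  # 1 + 1 - 2 + 1
print(s2, reaches(s2, 2), s4, reaches(s4, 2))
```

Because the comparison uses absolute magnitudes, a sum of -2 would also reach a threshold of 2 in this sketch.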
If the compression potential for each data byte within the coding window is analyzed (or if a partition boundary is reached for the data being compressed) before the first predetermined value is reached, swapping of the coding scheme used to compress the data within the coding window between coding schemes preferably is performed if the compression potential sum $S_p$ reaches a second predetermined value.
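The two-threshold decision can be sketched as a single pass over the window. This is an illustration under stated assumptions, not the patented implementation: the per-byte potentials are supplied as precomputed numbers, the threshold values are arbitrary, and the function name `should_swap` is hypothetical. Reaching the end of the list stands in for either exhausting the window or hitting a partition boundary.

```python
from typing import Sequence, Tuple


def reaches(value: float, threshold: float) -> bool:
    """'Reaches' as defined in the text: compare absolute magnitudes."""
    return abs(value) >= abs(threshold)


def should_swap(
    potentials: Sequence[float],
    first_threshold: float,
    second_threshold: float,
) -> Tuple[bool, int]:
    """Return (swap?, number of bytes examined).

    potentials[n] is an assumed, precomputed f(W[n+1]) for each byte.
    """
    running_sum = 0.0
    for count, value in enumerate(potentials, start=1):
        running_sum += value
        # Swap as soon as the running sum reaches the first threshold.
        if reaches(running_sum, first_threshold):
            return True, count
    # Window exhausted (or partition boundary) before the first
    # threshold was reached: fall back to the second threshold.
    return reaches(running_sum, second_threshold), len(potentials)


print(should_swap([1, 1, 1, 1], first_threshold=3, second_threshold=2))
print(should_swap([1, 0, 1, 0], first_threshold=5, second_threshold=2))
print(should_swap([1, 0, 0, 0], first_threshold=5, second_threshold=2))
```

In the first call the running sum reaches the first threshold after three bytes; in the second and third the whole window is examined and only the second threshold decides the outcome.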
A computer program product for use in a data compression process having two or more coding schemes also is provided. The inventive program product is carried by a medium readable by a computer (e.g., a carrier wave signal, a floppy disc, a hard drive, a random access memory, etc.). The computer readable medium comprises means for providing a coding window that allows analysis of the compression potential of data therewithin, means for analyzing the data within the window and means for swapping the coding scheme used to compress the data within the window from one scheme to another if the potential for compression reaches a predetermined value.
By thus analyzing the compression potential of data bytes prior to coding, and by selecting an appropriate coding scheme based thereon, data compression may be performed with the potential for minimal data expansion. Other objects, features and advantages of the present invention will become more fully apparent from the following detailed description of the preferred embodiments, the appended claims and the accompanying drawings.

