Method, apparatus, computer program and storage medium for...

Coded data generation or conversion – Digital code to digital code converters – Adaptive coding

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C341S050000

Reexamination Certificate

active

06664903

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a method and apparatus for reversible compression of digital data.
BACKGROUND OF THE INVENTION
For lossless (reversible) data compression methods, there are methods adopting LZ77/LZ78 invented by Ziv and Lempel as basics.
According to LZ77, a window buffer storing previous input data is looked up to find a longest-match string with current input data, and the position and length of the longest-match string found is outputted as coded data, thereby realizing data compression. This method is disclosed in U.S. Pat. No. 4,054,951 (Jackson, et al.) and “A universal algorithm for sequential data compression” by Ziv, J. and Lempel, A., IEEE Transaction on Information Theory, Vol. 23, No. 3, pp. 337-343, May 1977.
Furthermore, according to LZ78, a dictionary generated based on previous input data is looked up to find a longest-match string with current input data, and a code stored in correspondence with the data string found is outputted. Furthermore, a new data string, generated by linking the longest-match string found with the next character, is additionally registered in the dictionary to facilitate a longer match in the next search. This method is disclosed in U.S. Pat. Nos. 4,464,650, 4,558,302, and “Compression of individual sequences via variable-rate coding” by Ziv, J. and Lempel, A., IEEE Transaction on Information Theory, Vol. 24, No. 5, pp. 530-536, May 1978.
A 2-32 KB buffer is appropriate for the window buffer employed in LZ77. Since input data is compared with data stored in the window buffer, the range of the longest-match search is limited to 32 KB of the previous input data at most. On the contrary, in LZ78, the range of search is not limited to the nearest data as in LZ77. The range of the longest-match search in LZ78 can be expanded as far back the previous input data as the size of the dictionary, regardless of the size of the window buffer. Therefore, while LZ77 makes use of a near correlation, LZ78 makes use of a far correlation, thus has versatility.
Moreover, in LZ78, compression is realized by mere comparison between the current input data and dictionary. The construction of the dictionary can be made so as to better be adapted to the data search. Accordingly, the longest-match search can be performed at high speed.
However, in LZ78, data decompression must be performed while generating and updating the dictionary. On the contrary, in LZ77, since the position and offset of the data string in the window buffer are provided as code data, decompression can be performed without generating a dictionary. Therefore, decompression can be performed at higher speed than LZ78.
In other words, LZ78 has versatility with respect to data and is capable of high-speed data compression. LZ77 has poor data versatility, but is capable of high-speed decompression processing.
When compression is performed using LZ77, it is necessary to look up the window buffer storing previous input data to find a longest match with current input data. In order to increase the speed of this processing, there are hash search methods disclosed in U.S. Pat. No. 4,701,745 (J. R. Waterworth), U.S. Pat. No. 5,049,881 (D. K. Gibson), U.S. Pat. No. 5,051,745 (P. W. Katz), and RFC-1951 (“Deflate Compressed Data Format Specification version 1.3” by P. Deutsch). The hash search proposed by these known documents is described below.
FIG. 1
is an explanatory view of the hash search. Reference numeral
10
denotes a window buffer. The area on the left hand of the line P stores already-compressed previous input data before compression. The area on the right hand of the line P stores input data subjected to compression. Assume that the size of the left area of the line P is 32 KB. An offset, indicative of a position in the left area of the line P, increments as it goes toward the left, with the line P as an origin.
Reference numeral
11
denotes a hash array H[i], which stores an offset of the window buffer, and the number of elements is 2
15
=32768 entries. The length of the offset is 2 bytes.
FIG. 3
shows steps of compression processing. Description is provided according to this flowchart.
In step
301
, an initial value (head address of input data) is given to a pointer C indicative of a current input data string. In step
302
, H[i] is initialized to 0. Since an offset being 0 is improbable, this indicates that no data is stored in the offset. In step
303
, it is determined whether or not there is more input data to be compressed. If not, the control ends. If yes, the control proceeds to step
304
.
Reference numeral
12
in
FIG. 1
denotes a character string of current input data. Provided that the first three characters are expressed by an array C[0] to C[2], the hash value h is calculated by the method shown in
FIG. 2
(step
304
in FIG.
3
).
Note in
FIG. 2
, the reference letter {circumflex over ( )} indicates an exclusive OR. Other reference letters comply with the C language. “x<<y” indicates that x is shifted by y bit in the direction of higher bits. “x&y” indicates to AND x and y in units of bit. After calculating the hash value h, H[h] is compared with 0 in step
305
. If H[h] is 0, it indicates that a three-character string having the hash value h has not yet occurred. Then, in step
306
, a current offset of the current input data (offset of the first character) is stored in H[h]. Next in step
307
, data C[0] having 1 byte is outputted. In step
308
, the pointer C is incremented by 1 to enable processing of the next input data, and the control returns to step
303
.
If H[h] is not 0 in step
305
, it indicates that a three-character string having the same value as the calculated hash value h has occurred in the previous input data. The position m of the H[h] where the character string is located is extracted (step
309
). Then in step
310
, the current input data is compared with the previous input data located in the position m to obtain a longest-match length L. In step
311
, m and L are subjected to Huffman coding, and the coded data is outputted. After obtaining the longest-match length L, the pointer C is incremented by L in step
312
, and the control returns to step
303
.
In the compression processing of LZ77, the processing speed can be increased by employing the above-described hash search. However, the comparison between the current input data and previous input data is not expanded to the data inputted far back in the past beyond the window buffer, as in LZ78. Therefore, for instance, with regard to data repeated in a cycle of 32 KB that is the size of the window buffer, there is no effect of compression. Meanwhile according to the compression method of LZ78, since generating and updating the dictionary is necessary at the time of decompression, an overhead is generated. Thus, the decompression processing speed is slower than LZ77.
To increase the speed of decompression processing, it is preferable to employ code data, indicative of the offset and length, to extract a corresponding data string from the window buffer at the time of decompression. However, coding the offset and length of the data limits the target of longest-match search to the previous input data stored in the window buffer in compression processing. Therefore, it is difficult to have both ways: increasing decompression speed, and expanding the data search target in compression.
SUMMARY OF THE INVENTION
The present invention has been proposed in view of the conventional problems, and has as its object to provide a data compression method, apparatus, computer program, and storage medium, which can realize lossless data compression at high speed while taking advantage of high-speed decompression.
According to the present invention, the foregoing object is attained by providing a data compression method of reading input data from a predetermined input storage area, searching previous input data that matches the input data, generating coded data

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method, apparatus, computer program and storage medium for... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method, apparatus, computer program and storage medium for..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method, apparatus, computer program and storage medium for... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3107471

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.