Method of compressing data with an alphabet

Coded data generation or conversion – Digital code to digital code converters – Adaptive coding

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C341S087000

Reexamination Certificate

active

06262675

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to data compression, and in particular, to a method for compressing and decompressing data with an alphabet.
2. Description of Related Art
The Liv-Zempel 77 (LZ77) method is a well known method of data compression and decompression. However, it is inefficient in terms of its code space usage. This can be illustrated by an encoding and decoding example using the prior art LZ77 algorithm.
The following terms are used in describing the prior art LZ77 method:
Input Stream: a sequence of characters to be compressed;
Character: a basic data element in the input stream;
Coding Position: a position of the character in the input stream that is currently being coded (the beginning of a lookahead buffer defined below);
Lookahead Buffer: a character sequence from the coding position to an end of the input stream;
Window: a “backward” window of size W that contains W characters from the coding position, i.e., the last W characters previously processed;
Pointer: a pointer to a match in the window W that also specifies the length of the match.
With regard to encoding, the prior art LZ77 method searches the window for the longest match with the beginning of the lookahead buffer and outputs a pointer to that match. Since it is possible that not even a one-character match can be found, the output cannot contain just pointers. The prior art LZ77 method solves this problem as follows: after each pointer, it outputs the first character in the lookahead buffer after the match; if there is no match, then it outputs a null-pointer and the character at the coding position. Then, the coding position is moved further by one.
Specifically, the steps of the prior art LZ77 encoding method comprise the following:
(i) Set the coding position to the beginning of the input stream.
(ii) Find a match in the backward window W for the lookahead buffer.
(iii) output the triple (B,L)C with the following meanings:
(1) B is the number of characters to be traversed backwards in the backward window W in order to get to the starting location of the match. If there is no match, then B takes a null value (0) without loss of generality.
(2) L is the number of characters matched.
(3) C is the first character in the lookahead buffer that did not match.
(iv) If the lookahead buffer is not empty, then move the coding position (and the backward window W) L+1 characters forward and return to step (ii); otherwise, terminate.
This is best illustrated by providing an example of the prior art LZ77 encoding method. The following table describes the input data for the example, wherein the first row indicates the position and the second row indicates the corresponding character:
Pos
1
2
3
4
5
6
7
8
9
Char
A
A
B
C
B
B
A
B
C
The following table illustrates the prior art LZ77 encoding method performed on the above input data:
Step
Pos
W
Match
Char
Output
1.
1


A
(0,0) A
2.
2
A
A
B
(1,1) B
3.
4
AAB

C
(0,0) C
4.
5
AABC
B
B
(2,1) B
5.
7
AABCBB
AB
C
(5,2) C
The following describes the columns in the above table:
The column Step indicates the number of the encoding step. It completes each time the prior art LZ77 encoding method makes an output. With the prior art LZ77 method, this happens in each step of the encoding method above at (iii).
The column Pos indicates the coding position. The first character in the input stream has the coding position
1
.
The column W shows the backward window.
The column Match shows the longest match found in the window.
The column Char shows the first character in the lookahead buffer after the match.
The column Output presents the output in the format (B,L)C. (B,L) is the pointer to the Match, which provides the following instruction to the decoding method: “Go back B characters in the window and copy L characters to the output.” C is the next character.
With regard to the prior art LZ77 decoding method, the window is maintained the same way as during the encoding method. In each step, the decoding method reads a triple (B,L)C from the input. The decoding method outputs the sequence from the window specified by (B,L) and the character C.
The compression ratio achieved by the prior art LZ77 method is very good for many types of data, but the encoding method can be quite time-consuming, since there are a lot of comparisons to perform between the lookahead buffer and the window. On the other hand, the decoding method is very simple and fast. Memory requirements are low both for the encoding and the decoding methods, since the only structure held in memory is the window, which is usually sized between 4 and 1 kilobyte.
However, the prior art LZ77 method suffers from the problem of non-optimal code space usage, because it uses two integers and one character for a code. The first integer is the starting position of the match, the second integer is the length of the match, and the character is the first non-matching character after the match. In practical terms, including the first non-matching character after the match leads to compression inefficiency.
Other prior art methods exist to code this character selectively, based on an efficiency criteria. However, each requires that the decoding method check whether it is to decode a character of a string from the window. In logic or instruction terms, the check requires a conditional branch, once for every compressed code, resulting in inefficient logic. For systems that are read intensive (such as database management systems where reads outnumber writes by 3-to-1 or more), it is necessary to speed up the decoding method, and removing conditional branches from the decoding method is one means of doing so. Thus, there is a need in the art for an improved LZ77 method that not only optimizes code space usage, but also the speed of decoding.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for compressing and decompressing data using an embedded alphabet to reduce code space in the compressed data.


REFERENCES:
patent: 4748638 (1988-05-01), Friedman et al.
patent: 4814746 (1989-03-01), Miller et al.
patent: 4876541 (1989-10-01), Storer
patent: 5406279 (1995-04-01), Anderson et al.
patent: 5412384 (1995-05-01), Change et al.
patent: 5424732 (1995-06-01), Iyer et al.
patent: 5448733 (1995-09-01), Satoh et al.
patent: 5455576 (1995-10-01), Clark, II et al.
patent: 5521597 (1996-05-01), Dimitri
patent: 5532693 (1996-07-01), Winters et al.
patent: 5561421 (1996-10-01), Smith et al.
patent: 5572206 (1996-11-01), Miller et al.
patent: 5608396 (1997-03-01), Cheng et al.
patent: 5694125 (1997-12-01), Owsley et al.
Wan et al. (1994) IEEE International Conference on Neural Networks, pp. 4384-4389.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method of compressing data with an alphabet does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method of compressing data with an alphabet, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method of compressing data with an alphabet will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2461034

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.