Data compression with dynamically compiled dictionary

Patent

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

G10L 904

Patent

active

052533255

DESCRIPTION:

BRIEF SUMMARY
BACKGROUND OF THE INVENTION

1. Field of the Invention
The present invention relates to data compression systems which may be used, for example, to reduce the space required by data for storage in a mass storage device such as a hard disk, or to reduce the bandwidth required to transmit data.
2. Background Art
The invention is particularly concerned with data compression systems using dynamically compiled dictionaries. In such systems (see, for example, EP-A-0127815, Miller and Wegman) an input data stream is compared with strings stored in a dictionary. When characters from the data stream have been matched to a string in the dictionary the code for that string is read from the dictionary and transmitted in place of the original characters. At the same time when the input data stream is found to have character sequences not previously encountered and so not stored in the dictionary then the dictionary is updated by making a new entry and assigning a code to the newly encountered character sequence. This process is duplicated on the transmission and reception sides of the compression system. The dictionary entry is commonly made by storing a pointer to a previously encountered string together with the additional character of the newly encountered string.


SUMMARY OF THE INVENTION

According to a first aspect of the present invention there is provided a data compression system including a dictionary to store strings of characters with an index for each of said strings, and means for matching strings of characters of a data stream with strings of characters stored in the dictionary and for outputting the identity of a dictionary entry of a matched string when a following character of the data stream does not match with the stored strings, characterized in that the means for matching is arranged to determine, for each matched string having at least three characters, a sequence of characters from the at least three characters, the sequence including at least a first and a second of said at least three characters, to update the dictionary by extending an immediately-preceding matched string by the sequence.
According to a second aspect there is provided a method of data compression of individual sequences of characters in a data stream including the steps of storing strings in a dictionary with an index for each of said strings, and determining the longest string in the dictionary which matches a current string in the data stream starting from a current input position: the improvement including the steps of determining, for each matched string having at least three characters, a single sequence of characters from the said at least three characters, the single sequence including at least a first and a second of the at least three characters, but not including all of the at least three characters, and updating the dictionary by extending an immediately-preceding matched string by the single sequence.
Preferably, the previously matched string is extended by only the first two characters of said following multiple character matched string.
Preferably, the data compression system compresses data using a Mayne single-pass data compression algorithm.
In known systems, dictionary entries are made either by combining the single unmatched character left over by the process of searching for the longest string match with the preceding matched string or by making entries comprising pairs of matched strings. The former is exemplified by the Ziv Lempel algorithm ("Compression of Individual Sequences via Variable Rate Coding," J. Ziv, A. Lempel, IEEE Trans, IT 24.5, pp. 530-36, 1978), the latter by the conventional Mayne algorithm (Information Compression by Factorizing Common Strings," A. Mayne, E. B. James, Computer Journal, vol. 18.2, pp. 157-60, 1975), and EP-A-012815, Miller and Wegman, discloses both methods. The preferred example of the present invention uses the Mayne data compression algorithm but modifies the process of updating the dictionary so that the entries comprise a matched string and part of the following m

REFERENCES:
patent: 4876541 (1989-10-01), Storer
patent: 5153591 (1992-10-01), Clark
Calingaert, Assemblers, Compilers, and Program Translation, Computer Science Press, 1979, pp. 195-196.
Storer--"Data Compression," Computer Science Press, 1988, pp. 69-74.
Computer Journal, vol. 18, No. 2, 1975, A. Mayne et al: "Information compression by factorising common strings", pp. 157-160.
Proc. Very Large Databases, Cannes, 9-11, Sep. 1981 IEEE, (New York, US), C. A. Lynch et al: "Application of data compression techniques to a large bibliographic database", pp. 435-447.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Data compression with dynamically compiled dictionary does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Data compression with dynamically compiled dictionary, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Data compression with dynamically compiled dictionary will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-1912248

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.