Variable length encoding and decoding of ascending numerical...

Coded data generation or conversion – Digital code to digital code converters – To or from variable length codes

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C341S095000

Reexamination Certificate

active

06600432

ABSTRACT:

FIELD OF THE INVENTION
The invention relates to the compression of sequences of numeric data.
BACKGROUND OF THE INVENTION
Many computer applications involve storing and accessing long sequences of values that increase in value as one progresses through the sequence. Referring to
FIG. 1
, for example, a search engine may store occurrence data for the location of a given word in a given file. The occurrence data may be represented as a sequence of elements, each successive element ascending in value. The values indicating the position within the file where the word appears.
Such sequences consume a great deal of space. This consumes a great deal of memory and other system resources. It also affects the speed at which the data can be stored and accessed. Obviously, this is a significant issue for computer applications, particularly those applications where the results are intended to be provided in real time, perhaps to remote users.
Depending on the application, slower storage and access and larger resource requirements can have a significant commercial effect. For example, a user of a slow internet search engine can easily decide to use an alternative engine, resulting in a loss of revenue.
Many different techniques have been proposed to make the storage and accessing of computer data more efficient. For example, Cohen discloses an Optimization of Data Representations for Transmission of Storage Using Differences from Reference Data in U.S. Pat. No. 5,977,889 issued Nov. 2, 1999. Cohen compresses a representation of differences between a base sequence of data and the actual data to be transmitted or stored.
Tsai et al discloses a Method for Reducing Storage Requirements for Grouped Data Values in U.S. Pat. No. 5,818,877 issued Oct. 6, 1998. The method generates a reduced number of values representing a sequence of grouped data values and partitions the reduced number of values by first mapping data values into groups of symbols and then partitioning the resulting stream of symbols.
None of these techniques are optimized directly for numerical values in ascending sequence. It is desirable to have such methods.
SUMMARY OF THE INVENTION
In a first aspect the invention provides a method of compressing ascending order numerical data in a computer system. The method determines the minimum and maximum gaps between the data set, and reduces all values in the data set except for the first and last values using the formula:
reduced element value=original element value−first element value−((minimum gap−1)×current position in sequence)
Where:
original element value is the value being reduced, first element value is the value of the first element in the sequence (also the first value in the array and the lowest possible value),
minimum gap is the smallest distance between any two adjacent elements in the sequence,
current sequence position in the sequence is the current element's position within the sequence (array), where position 0 is that of the first element in the sequence, and
X is a multiplication sign
The method then stores for each sequence the first and last values, the number of bits required to represent the reduced second element, the number of bits required to store the entire sequence, and discards all leading zeros bits and first non zero bits.
In a second aspect the invention provides a method of encoding a sequence of elements of ascending numerical value. The method reads the sequence of elements of ascending numerical values from a medium using a computer system, determines a minimum gap and a maximum gap of the sequence, reduces each element value according to the formula of the first aspect. After reduction, and in a binary representation of each reduced value, the method strips any leading zero bits and the first non-zero bit, stores the resulting values in a body as a bitstream, and stores in a header the first value, last value, minimum gap and maximum gap of the original sequence, the number of bits in the first value in the body, and an indicator of the start and the end of the body.
The method may, after stripping the leading bits, if the reduced value and the reduced value plus the maximum gap have the same number of bits and there are any remaining leading bits that are the same in the reduced value and the reduced value plus the maximum gap then drop the remaining leading bits that are the same.
The method may, after stripping the leading bits, if the current stripped value has N+1 bits and the previous stripped value in the sequence has N bits and the value of the first N bits of the current stripped value is not less than the previous stripped value then insert in the body before the current stripped value a bump code to the number of bits in the current stripped value.
The method may, after stripping the leading bits, if the current stripped value has N+1 bits and the previous stripped value in the sequence has N bits and the value of the first N bits of the current stripped value is not less than the previous stripped value then insert in the body before the current stripped value a bump code to the number of bits in the current stripped value.
The method may, after stripping the leading bits, if the current stripped value has N+1 bits and the previous stripped value in the sequence has N bits and the value of the first N bits of the current stripped value is less than the previous and the stripped value has N zeros bits stripped value then insert in the body before the current stripped value a bump code to the number of bits in the current stripped value.
The method may, after the determining step, if the maximum gap is the same as the minimum gap then only store in a header the first value, last value, minimum gap and maximum gap of the original sequence, and not store a body.
In a third aspect the invention provides a method of decoding a sequence of ascending numerical values from a header and an optional body of data. The method reads the header from a medium utilizing a computer system to obtain for a sequence the first element value, last element value, minimum gap, maximum gap. If the body exists then the method reads the header to obtain an indication of the start of the body, the length of the first binary value in the body, and the end of the body, and determines if the minimum gap and the maximum gap are the same, and if so then computes the sequence of values from the header. If the minimum gap and the maximum gap are not the same then the method decodes a first binary value in the body from a medium using a computer system and adds in a leading bit, and continues to read the next value. If the read binary value is not a bump code and the value is not greater than the previous binary value then reads an additional bit from the body and adds a leading one bit to the actual binary value to obtain one of the sequence of values. If the read binary value is not a bump code and the value is greater than the previous binary value then the method adds a leading one bit to the actual binary value to obtain one of the sequence of values. If the read binary value is a bump code then the method reads the number of bits specified by the bump code to obtain the actual binary value and then adds a leading one bit to the actual binary value to obtain one of the sequence of values. If the value is not the last value in the body then the method continues above at reading the next value.
Reading the next value may include, if leading bits of the value were dropped on encoding because the bits were the same as in the preceding value then reading the number of bits from the preceding value after the leading bits that are the same, adding in the leading bits that are the same, adding a leading one to the value, and continuing to read the next value. Reading the next value may include, if leading bits are not the same then continuing to read the same number of bits from the body as for the previous value.
The header may store the number of bits for the first binary value in the body, and if the current value is th

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Variable length encoding and decoding of ascending numerical... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Variable length encoding and decoding of ascending numerical..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Variable length encoding and decoding of ascending numerical... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3030846

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.