Arithmetic coding and decoding methods and related systems

Image analysis – Image compression or coding – Adaptive coding

Reexamination Certificate


Details

C382S232000, C341S051000, C341S107000

active

06560368

ABSTRACT:

TECHNICAL FIELD
This invention relates to arithmetic coding and decoding methods and to related systems.
BACKGROUND OF THE INVENTION
Data compression seeks to reduce the number of bits that are used to store or transmit information. This is important in today's market, where there is continuing pressure to increase the speed and volume of data transmission. Many systems for data compression use one of three techniques, or a variation thereof.
A first technique is a string matching technique. String matching techniques parse streams of data and look for repetitions of strings that have already occurred. This works well for textual data, as the words “the”, “and”, “or” and the like occur frequently in the English language. For binary data, string matches occur less frequently or not at all. String matching techniques do not work well on data that is random in nature, such as on data associated with audio/visual information. Compression algorithms that fall into this category include LZ77, LZ78, and LZW—which are three of the most popular compression algorithms on the market today.
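A string matcher of this kind can be sketched as follows. This is a toy LZ77-style parser for illustration only, not any particular product's algorithm; the function name and token format are assumptions.

```python
def lz77_tokens(data: bytes, window: int = 4096):
    """Toy LZ77-style parser: emits (offset, length, next_byte) tokens
    by searching a sliding window for the longest earlier match.
    A naive O(n*window) sketch for illustration only."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # Extend the match; stop one byte early so a literal
            # "next byte" always remains to emit with the token.
            while (i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        tokens.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return tokens
```

On text such as `b"abab"` the third token reuses the earlier `a`, which is exactly the repetition that makes this family of algorithms effective on textual data and weak on random binary data.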
A second technique is a “rate of language” technique. Rate of language techniques exploit the fact that in most data certain values occur more frequently than others. These techniques substitute short codes for more frequently occurring values and longer codes for less frequently occurring values. By using shorter codes more often, data can be compressed. This type of compression is normally limited to 8:1 or 16:1 compression ratios depending on implementation factors, and more realistically achieves ratios in the range of 1.5:1 to 2:1. Huffman encoding algorithms use this technique.
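The short-codes-for-frequent-symbols idea can be sketched with a minimal Huffman code builder (illustrative only; the function name is an assumption):

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict:
    """Sketch of Huffman coding: frequent symbols receive shorter bit
    strings. Illustrative, not an optimized implementation."""
    freq = Counter(text)
    # Heap entries are (frequency, tie_breaker, tree); a tree is either
    # a symbol (leaf) or a (left, right) pair of subtrees.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, (left, right)))
        tie += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # degenerate one-symbol input
    walk(heap[0][2], "")
    return codes
```

For the message "BILL GATES" used below, the repeated letter L is guaranteed a code no longer than any other symbol's.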
A third technique is a “rate of change” technique. Rate of change techniques utilize the notion of looking at the first differential of plotted data. This approach recognizes that for certain types of data, the first differential is more structured than the data itself. This can be especially true for image data. The data can be encoded as the amount that the data varies from a starting point, after which this new data can be compressed by one or more of the previously-mentioned techniques. While rate of change techniques work well for binary data, they do a poor job of compressing textual data. Current implementations of this technique include the Discrete Cosine Transformation used to compress JPEG image and MPEG video files.
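The first-differential idea can be illustrated with a simple delta encoder and decoder (a sketch; real implementations such as the DCT mentioned above are considerably more involved):

```python
def delta_encode(samples):
    """Rate-of-change sketch: keep the first value, then store only the
    successive differences, which are typically small for smooth data."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def delta_decode(deltas):
    """Inverse: a running sum rebuilds the original samples exactly."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out
```

A slowly varying signal like `[100, 101, 103, 102]` becomes `[100, 1, 2, -1]`, whose small values compress well under the previously mentioned techniques.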
In the past fifteen years, arithmetic coding has emerged as a candidate to replace Huffman coding. This technique bypasses the idea of replacing each input symbol with a specific code. Instead, it replaces a stream of input symbols with a single floating-point output number. Longer, more complex messages require more bits in the output number. The output from an arithmetic coding process is a single number less than 1 and greater than or equal to 0. This single number can be uniquely decoded to recreate the exact stream of symbols that went into its construction. In one exemplary prior art technique, the symbols are assigned a set of probabilities in order to construct the output number. For example, the message “BILL GATES” would have a probability distribution like this:
Character   Probability
SPACE       1/10
A           1/10
B           1/10
E           1/10
G           1/10
I           1/10
L           2/10
S           1/10
T           1/10
Once character probabilities are known, individual symbols are assigned a range along a probability line, nominally 0 to 1. It does not matter which characters are assigned which segments of the range, as long as it is done in the same manner by both the encoder and the decoder. The nine-character symbol set used here then looks like the following:
Character   Probability   Range
SPACE       1/10          0.00 <= r < 0.10
A           1/10          0.10 <= r < 0.20
B           1/10          0.20 <= r < 0.30
E           1/10          0.30 <= r < 0.40
G           1/10          0.40 <= r < 0.50
I           1/10          0.50 <= r < 0.60
L           2/10          0.60 <= r < 0.80
S           1/10          0.80 <= r < 0.90
T           1/10          0.90 <= r < 1.00
Each character is assigned a portion of the 0 to 1 range that corresponds to its probability of appearance. Note that the character “owns” everything up to, but not including, the higher number. The most significant portion of an arithmetic-coded message belongs to the first symbol, B, in the message “BILL GATES.” To decode the first character properly, the final coded message has to be a number greater than or equal to 0.20 and less than 0.30. To encode this number, the range in which it can fall is tracked. After the first character is encoded, the low end of this range is 0.20 and the high end is 0.30. During the rest of the encoding process, each new symbol further restricts the possible range of the output number. For example, the next character to be encoded, the letter I, owns the range 0.50 to 0.60 within the new sub-range of 0.2 to 0.3. So, the new encoded number falls somewhere in the 50th to 60th percentile of the currently established range. Applying this logic restricts the number to the range 0.25 to 0.26. This process continues until the entire message is encoded by the final value 0.2572167752. Arithmetic coding techniques are described in The Data Compression Book, 2nd Edition, by Nelson and Gailly.
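The range-narrowing walk-through above maps directly onto a short floating-point sketch. The ranges are hardcoded from the table; the function names are assumptions, and real coders avoid floating point for exactly the reasons discussed next.

```python
# Symbol ranges from the "BILL GATES" table above
# (low bound inclusive, high bound exclusive).
RANGES = {
    " ": (0.00, 0.10), "A": (0.10, 0.20), "B": (0.20, 0.30),
    "E": (0.30, 0.40), "G": (0.40, 0.50), "I": (0.50, 0.60),
    "L": (0.60, 0.80), "S": (0.80, 0.90), "T": (0.90, 1.00),
}

def encode(message):
    """Narrow [low, high) once per symbol; any number in the final
    interval uniquely identifies the whole message."""
    low, high = 0.0, 1.0
    for ch in message:
        span = high - low
        lo_r, hi_r = RANGES[ch]
        low, high = low + span * lo_r, low + span * hi_r
    return low, high

def decode(code, length):
    """Reverse the narrowing: find the symbol that owns `code`,
    rescale, and repeat for `length` symbols."""
    out = []
    for _ in range(length):
        for ch, (lo_r, hi_r) in RANGES.items():
            if lo_r <= code < hi_r:
                out.append(ch)
                code = (code - lo_r) / (hi_r - lo_r)
                break
    return "".join(out)
```

Encoding "BILL GATES" yields an interval whose low bound is the value 0.2572167752 quoted above, and any number inside the interval decodes back to the original message.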
This type of arithmetic encoding has problems that are associated with the processing that must take place in order to properly encode a message. Specifically, floating point mathematics is necessary to properly encode and decode messages. This can impact the speed with which messages are compressed and decompressed. Since speed is a critical parameter in compression and decompression processing, this is undesirable.
Problems associated with transmitting information such as audio/video information on limited-bandwidth networks continue to put pressure on the industry to develop better techniques for compressing data. For example, even with the bandwidth of 10 Mbps Ethernet networks, only approximately four audio/visual streams can be broadcast simultaneously. Many solutions today either increase the available bandwidth of a connection or minimize the amount of data that must be transmitted. Increased bandwidth is the most straightforward solution to the problem, but it can be cost prohibitive. Accordingly, the focus has turned to solutions that allow for the transmission of more data with less required bandwidth.
This invention arose out of concerns associated with providing improved methods and systems for encoding or compressing data, and decoding or decompressing data.
SUMMARY OF THE INVENTION
Arithmetic coding and decoding methods and compressing and decompressing methods and systems are described. In one embodiment, occurrence values are calculated that represent the number of times individual component values appear in an input stream that is to be encoded. The occurrence values are normalized a first time to ensure that they are all powers of 2. The occurrence values are then normalized a second time to ensure that the sum of all of the occurrence values is a power of 2. Encoding takes place through arithmetic coding techniques that utilize a range or line having a length that is equal to the normalized sum of the occurrence values. Various sub-ranges or segments within the range are assigned to individual component values of the input stream that is to be encoded. A starting position is defined within the range or on the line, and a determination is made as to whether the position is within the sub-range or segment that is necessary to encode an individual component value. If the position is not within the appropriate sub-range or segment, then the position is moved to a new position, one or more encoding bits are emitted, and the new position is checked until a position is found to be within the desired sub-range or segment. When this occurs, the sub-range or segment is expanded and the next component value of the input stream is retrieved for encoding as described above. Decoding takes place through arithmetic decoding
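One possible reading of the two normalization passes can be sketched as follows. The function name and the specific rounding rules are hypothetical illustrations of the stated goals (every count a power of 2, and a power-of-2 total), not the patent's claimed method:

```python
from collections import Counter

def pow2_floor(n: int) -> int:
    """Largest power of 2 that is <= n (n >= 1)."""
    return 1 << (n.bit_length() - 1)

def normalize_counts(stream):
    """Hypothetical sketch of the summary's two normalization passes.
    Pass 1: round each occurrence count down to a power of 2.
    Pass 2: repeatedly double the smallest count until the total is
    itself a power of 2. (Illustrative rules, not the claimed method.)"""
    counts = Counter(stream)
    norm = {sym: pow2_floor(c) for sym, c in counts.items()}
    total = sum(norm.values())
    target = 1 << (total - 1).bit_length()  # next power of 2 >= total
    while total < target:
        # Every count is a multiple of the smallest count, so the gap
        # to the target is too; doubling the smallest never overshoots.
        sym = min(norm, key=norm.get)
        total += norm[sym]
        norm[sym] *= 2
    return norm
```

After normalization the coding range has a power-of-2 length, which is what lets the encoding proceed without the floating-point mathematics criticized in the background section.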


Profile ID: LFUS-PAI-O-3003270
