Encoding and decoding apparatus using context

Coded data generation or conversion – Digital code to digital code converters – Adaptive coding

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C341S106000

Reexamination Certificate

active

06549148

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a coding apparatus and a decoding apparatus which can be optimally applied in compressing and reconstructing various data such as CAD data, document data, etc.
2. Description of the Related Art
Recently, an increasing volume of various types of data such as character codes, image data, etc. have been processed in a computer. When such large volume of data is stored and transmitted to a distant destination, it is common to compress the data with the redundant portion of the data removed to reduce the storage capacity and improve the transmission speed.
There are two common data compressing systems. They are a dictionary type coding system based on the similarity in data sequences; and a probability statistic type coding system based on the frequency of occurrences of data strings.
A typical example of the dictionary type coding system is an LZ77 system and an LZ78 system.
In the LZ77 system, a predetermined buffer is provided, the position of the previous data matching in longest length is retrieved from the previously input data in the buffer, and the matching position and the matching length are used as codes.
FIG. 1
shows the coding method in the conventional LZ77 system.
In
FIG. 1
, assume that ‘a b a b c d e f a b c d e f g h . . . ’ is input as data to be compressed, and each character of the data to be compressed is assigned an input number indicating an occurrence position.
First, if ‘a’ having the input number
1
is input, then the character ‘a’ is coded as is because it has no preceding characters. Then, when a character ‘b’ having the input number
2
is input, it is compared with the previously input characters. However, there are no characters matching the character ‘b’, the character ‘b’ is coded as is. Furthermore, when a character string ‘a b’ having the input numbers
3
and
4
is input, it is compared with the previously input character strings. As a result, since the character string matches a character string ‘a b’ having the input numbers
1
and
2
, the character string ‘a b’ having the input numbers
3
and
4
is coded using the matching position and matching length. In this example, since the matching position is the position of the character ‘a’ having the input number
1
, and the matching length is 2, ‘(
1
,
2
)’ is coded as the code of the character string ‘a b’ having the input numbers
3
and
4
.
Next, when a character ‘c’ having the input number
5
is input, it does not match any of the previously input characters. Therefore, the character ‘c’ is coded as is. When a character ‘d’ having the input number
6
is input, it does not match any of the previously input characters. Therefore, the character ‘d’ is coded as is. When a character ‘e’ having the input number
7
is input, it does not match any of the previously input characters. Therefore, the character ‘e’ is coded as is. When a character ‘f’ having the input number
8
is input, it does not match any of the previously input characters. Therefore, the character ‘f’ is coded as is.
Then, when a character string ‘a b c d e f’ having the input numbers
9
through
14
is input, it matches a character string ‘a b c d e f’ having the input numbers
3
through
8
. Therefore, the character string ‘a b c d e f’ having the input numbers
9
through
14
is coded using the matching position and the matching length. In this example, since the matching position is position of the character ‘a’ having the input number
3
, and the matching length is 6, ‘(
3
,
6
)’ is coded as the code of the character string ‘a b c d e f’ having the input numbers
9
through
14
.
When a character ‘g’ having the input number
15
is input, it does not match any of the previously input characters. Therefore, the character ‘g’ is coded as is. When a character ‘h’ having the input number
16
is input, it does not match any of the previously input characters. Therefore, the character ‘h’ is coded as is. On the other hand, in the LZ78 system, a previously input character string is entered in a dictionary, and an entered input number is coded.
The LZ77 system has higher compression performance than the LZ78 system for data containing a repetition of a long character string. On the other hand, the LZ78 system has higher compression performance than the LZ77 system for data containing a repetition of a comparatively short character string. The LZ77 system and the LZ78 system are described in, for example, the document “The Introduction to the Document Data Compression Algorithm” by Tomohiko Uematsu published by CQ Publishing Company.
A typical system of the probability statistic type coding system can be the arithmetic coding system and Huffman coding system. Both arithmetic coding system and Huffman coding system obtain a compression effect by allotting a short code length to a character having a high occurrence probability according to the statistic occurrence frequency of each character.
The arithmetic coding system is described in, for example, the document “Arithmetic coding revisited” by Alister Moffat et al., 1995, IEEE Data Compression Conference, p202-211. The Huffman coding system is described in, for example, the document “The Introduction to the Document Data Compression Algorithm” by Tomohiko Uematsu published by CQ Publishing Company.
To obtain a higher compression effect, a variable length coding method has been suggested based on the conditional occurrence probability (P[Xt|Xt−1]) in which not the occurrence probability (P(Xt)) of a single character but the dependence (hereinafter referred to as a context) between an input character and its previous is taken into account. This method is described in, for example, the document “Unbounded Length Contexts for PPM” by John G. Cleary et al., 1995, IEEE Data Compression Conference, p52-61.
The probability statistic type coding system as well as the LZ78 system has higher compression performance for data containing a repetition of a comparatively short character string. Normally, the LZ78 system has a higher processing speed than the probability statistic type coding system. On the other hand, the probability statistic type coding system has a higher compression rate than the LZ78 system.
However, the LZ78 system and the probability statistic type coding system have high compression rate for data containing a repetition of a comparatively short character string, but cannot have sufficient compression rate for data containing a repetition of a long character string.
On the other hand, the LZ77 system has high compression rate for data containing a repetition of a long character string, but cannot have sufficient compression rate for data containing a repetition of a comparatively short string.
Therefore, the conventional compression systems have difficulty in obtaining high compression rate for data containing a repetition of long character strings and comparatively short character strings.
The present invention aims at providing a data coding apparatus capable of efficiently compressing both long and short character strings.
SUMMARY OF THE INVENTION
To solve the above described problem, the present invention includes a symbol string detection unit for detecting a second symbol string matching a first symbol string having a predetermined length from an input symbol string; a matching length detection unit for detecting a matching length between a third symbol string following the first symbol string and a fourth symbol string following the second symbol string; and a coding unit for coding the input symbol string based on the symbol string detected by the symbol string detection unit and the matching length detected by the matching length detection unit.
Thus, for input data having a repetition of long symbol strings, a part of matching symbol string can be coded based on the matching length. Accordingly, the input data having a repetition of long symbol strings can be efficiently compressed. In addition, since a remaining portion of a matching symbol

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Encoding and decoding apparatus using context does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Encoding and decoding apparatus using context, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Encoding and decoding apparatus using context will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3058545

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.