Data compression and restoration system for encoding an...

Image analysis – Image compression or coding – Adaptive coding

Reexamination Certificate


Details

Reexamination Certificate, active, 06215906

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a data compression system and a data restoration system that adopt probability statistical coding, such as arithmetic coding, in which data such as character codes or images is encoded byte by byte. More particularly, this invention is concerned with a data compression system and a data restoration system enjoying improved processing performance due to pipelined processing.
2. Description of the Related Art
In recent years, various kinds of data such as character codes, vector information, and images have been handled by computers, and the amount of data to be handled is increasing rapidly. For handling a large amount of data, redundancies in the data are eliminated to reduce its volume. Consequently, storage capacity can be reduced and data can be transmitted to remote places.
One of the methods of compressing various kinds of data with a single algorithm is universal coding. Universal coding falls into various techniques, of which there are two typical families: dictionary coding and statistical coding. Probability statistical coding such as arithmetic coding is a typical statistical technique. The present invention can be adapted to various kinds of data, not limited to character codes. Hereinafter, following the terminology of information theory, a unit of data (one word) shall be referred to as a character, and data composed of any number of consecutive words shall be referred to as a character string.
1. Dictionary coding, or Lempel-Ziv coding
Typical techniques: LZW, LZSS
2. Statistical coding
Typical techniques: multi-value arithmetic coding, dynamic Huffman coding
The dictionary coding is such that past character strings (whose lengths are variable) are registered in a table called a dictionary, and a subsequent character string is encoded according to information (whose code length is constant) giving the location in the dictionary of the longest matching registered character string. This technique is based on variable-fixed (VF) coding, in which a long variable-length character string (for example, several tens of characters) is encoded according to fixed-length information of, for example, 12 bits.
For the details of a dictionary coding algorithm, refer to Section 8, Dictionary Techniques, in “Text Compression” (Prentice-Hall, 1990) written by T. C. Bell et al. (1).
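The dictionary mechanism can be illustrated with an LZW-style sketch (an illustration of the general technique, not the implementation discussed in this patent): variable-length strings are registered in a table, and the longest registered match is emitted as a fixed-width index.

```python
def lzw_compress(data: bytes) -> list[int]:
    """LZW sketch: variable-length strings map to fixed-length dictionary indices."""
    # Start with all single-byte strings registered.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the longest match
        else:
            codes.append(table[current])   # emit index of the longest registered string
            table[candidate] = next_code   # register the new, longer string
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes
```

For example, `lzw_compress(b"abababab")` reduces eight input bytes to five indices, because the strings "ab" and "aba" become registered and are reused.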
By contrast, the probability statistical coding is such that the probability of occurrence of each past individual character is calculated (since one character is concerned, its code length is fixed; the probability may be a conditional probability subordinate to the immediately preceding character string), and a succeeding character is encoded according to variable-length statistical (entropy) information reflecting the calculated probability of occurrence. This technique is based on fixed-variable (FV) coding, in which fixed-length characters are encoded one by one according to variable-length statistical information reflecting their probabilities of occurrence.
For the details of a probability statistical coding algorithm, refer to Section 5, From Probabilities to Bits, in “Text Compression” (Prentice-Hall, 1990) written by T. C. Bell et al. (2). Typical statistical coding techniques include Huffman coding and arithmetic coding.
For the details of context modeling for obtaining the subordinate relationship to an immediately preceding character string, refer to Section 6, Context Modeling, in “Text Compression” (Prentice-Hall, 1990) written by T. C. Bell et al. (3). Herein, the subordinate relationship to an immediately preceding character string is expressed with several characters at most, whereas an infinitely long character string can be used in the dictionary coding.
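Conditioning on a short preceding string can be sketched as follows (a minimal illustration with assumed names, not this patent's data structure): counting each character's occurrences under its preceding context yields the conditional probabilities the model needs.

```python
from collections import defaultdict

def build_context_counts(text: str, order: int = 2):
    """Count character occurrences conditioned on up to `order` preceding characters."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(text)):
        context = text[max(0, i - order):i]   # the preceding string (the context)
        counts[context][text[i]] += 1
    return counts

counts = build_context_counts("abcabcabd")
# Conditional probability of 'c' given the context "ab":
p_c_given_ab = counts["ab"]["c"] / sum(counts["ab"].values())
```

In "abcabcabd" the context "ab" is followed by 'c' twice and by 'd' once, so the estimated conditional probability of 'c' after "ab" is 2/3.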
Consequently, the dictionary coding and probability statistical coding are different from each other in the following points:
1. the dictionary coding handles past data in the form of a character string itself, while the probability statistical coding handles it in the form of a probability of occurrence; and
2. the dictionary coding encodes variable-length character strings into fixed-length codes, while the probability statistical coding encodes fixed-length characters into variable-length codes.
Thus, the dictionary coding and probability statistical coding are fundamentally different from each other in terms of compression mechanism. Herein, multi-value arithmetic coding, which handles mainly a data stream or byte stream of an English text or the like, is taken as an example of universal coding.
The arithmetic coding is broadly divided into two proposed encoding techniques: binary arithmetic coding and multi-value arithmetic coding. They differ in that the binary arithmetic coding handles the two digits corresponding to bits 0 and 1 as a unit of data to be encoded, while the multi-value arithmetic coding handles many digits corresponding to, for example, one byte of 8 bits as a unit of data to be encoded. A typical implementation of the binary arithmetic coding is the QM coder employed in JBIG entropy coding, a standard technique of binary image compression recommended by the CCITT and ISO. For details, refer to Chapter 3, Arithmetic Coding, in “International Standards of Multimedia Coding” (Maruzen, pp. 68-82, June 1991) (4).
Typical examples of the multi-value arithmetic coding are Witten coding and Abramson coding. For details, refer to “Arithmetic Coding for Data Compression” (Communications of the ACM, Vol. 30(6), pp. 520-540, July 1987) written by I. H. Witten et al. (5) and “An Adaptive Dependency Source Model for Data Compression” (Communications of the ACM, Vol. 32(1), pp. 77-83, January 1989) (6). At present only the binary arithmetic coding is utilized in practice, for the reason that it is suitable for images. However, since the multi-value arithmetic coding enjoys the highest compression performance, its practical utilization is expected.
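The core of arithmetic coding, binary or multi-value, is repeatedly narrowing a subinterval of [0, 1) in proportion to symbol probabilities. A simplified floating-point sketch follows (real coders use integer arithmetic with renormalization; the fixed three-symbol model is an assumption for illustration):

```python
def narrow_interval(symbols, model):
    """Shrink [low, high) once per symbol; any value in the final
    interval identifies the whole symbol sequence."""
    low, high = 0.0, 1.0
    for s in symbols:
        width = high - low
        c_lo, c_hi = model[s]          # the symbol's cumulative-probability slice
        low, high = low + width * c_lo, low + width * c_hi
    return low, high

# Fixed illustrative model: P(a)=0.5, P(b)=0.3, P(c)=0.2
model = {"a": (0.0, 0.5), "b": (0.5, 0.8), "c": (0.8, 1.0)}
low, high = narrow_interval("ab", model)
```

Encoding "ab" narrows the interval to [0.25, 0.4); its width, 0.15, equals P(a) × P(b), which is why more probable sequences need fewer bits to pin down.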
The probability statistical coding requires, as shown in FIG. 1, an occurrence frequency modeling unit 400 and an entropy coding unit 402. The occurrence frequency modeling unit 400 fetches an input character and the immediately preceding character string (context), and calculates the occurrence frequency of the input character in terms of the subordinate relationship to the immediately preceding character string. The entropy coding unit 402 carries out variable-length encoding to produce a code dynamically on the basis of the occurrence frequency calculated by the occurrence frequency modeling unit 400. The further details will be described. Take for instance a character string abc composed of three characters a, b, and c, as shown in FIG. 2A. The relationship to the immediately preceding character string is expressed in the form of the tree structure shown in FIG. 2B. The occurrence frequency modeling unit 400 counts up the number of occurrences at every occurrence of a character string linking characters at nodes of the tree structure shown in FIG. 2B, and thus obtains the subordinate relationship to the immediately preceding character string, for example, a conditional probability.
A context acquisition method for obtaining the subordinate relationship of such an input character to an immediately preceding character string falls into a method of acquiring a fixed-order context and a method of acquiring a blend context. Herein, the number of characters implied by a context is referred to as an order. The method of acquiring a fixed-order context fixes the number of characters in a context. Taking a two-order context for instance, the occurrence frequency modeling unit 400 acquires the context of a character linked to the two immediately preceding characters x2
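The two-stage structure just described, a modeling unit feeding an entropy coding unit, can be sketched as follows (a simplified, sequential illustration with assumed names; the patent's point is that these two stages can be pipelined):

```python
from collections import defaultdict

class FrequencyModel:
    """Occurrence-frequency modeling stage: two-order fixed context, add-one counts."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def slices(self, context: bytes):
        """Return each byte's cumulative-probability slice of [0, 1) under `context`."""
        freqs = [self.counts[context][b] + 1 for b in range(256)]  # add-one smoothing
        total = sum(freqs)
        out, cum = {}, 0
        for b in range(256):
            out[b] = (cum / total, (cum + freqs[b]) / total)
            cum += freqs[b]
        return out

    def update(self, context: bytes, byte: int):
        self.counts[context][byte] += 1

def encode(data: bytes, order: int = 2):
    """Entropy coding stage: narrow one interval using the model's frequencies."""
    model = FrequencyModel()
    low, high = 0.0, 1.0
    for i, byte in enumerate(data):
        context = data[max(0, i - order):i]
        c_lo, c_hi = model.slices(context)[byte]
        width = high - low
        low, high = low + width * c_lo, low + width * c_hi
        model.update(context, byte)   # adapt after coding so a decoder could mirror it
    return low, high
```

Because the model adapts as it goes, bytes seen repeatedly under the same context receive wider probability slices, so the interval shrinks more slowly and predictable data yields a shorter code.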
