Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-06-12
2003-03-04
Robinson, Greta (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
Reexamination Certificate
active
06529912
ABSTRACT:
BACKGROUND OF THE INVENTION
(1) Field of the Invention
The present invention relates to a data compressing apparatus and a data decompressing apparatus, a data compressing method and a data decompressing method, a data compressing or decompressing dictionary creating apparatus, and a computer readable recording medium storing a data compressing program or data decompressing program for use when various data such as text data (character codes), image data, etc., is compressed or decompressed.
(2) Description of the Related Art
In recent years, various kinds of data such as character codes, image data, etc., is handled in a computer and a quantity of the data is increasing. Because of this, it is general in the computer that a redundant part in handled data is omitted and compressed so that a storage capacity necessary when the data is managed is decreased, or a transmission rate or transmission efficiency is increased at the time of data communication with a remote place in order to decrease a communication cost.
As data compressing methods, there are, for example, dictionary-based coding in which analogy of inputted data strings is used to code and compress the data strings, and statistical coding in which a frequency of occurrence of inputted data strings is used to code and compress the data strings. Hereinafter, one word of data (one alphabetic character, for example) is referred to as “a character”, whereas a train of an arbitrary number of words of data is referred to as “a character string”.
In concrete, in the former dictionary-based coding, a predetermined number (code) is assigned to a character or a character string occurring in a data string (data file, for example) that is an object of compression to create a dictionary (code table), and an actually inputted character (character string) is coded on the basis of the dictionary. A character (character string) having a higher probability is generally assigned as a longer character string in the dictionary so that a compression ratio is improved.
LZ
77
and LZ
78
(refer to “Introduction to Document Data Compression Algorithm”, Tomohiko Uematsu, CQ Shuppansha, for example) are representatives of the dictionary coding system.
In LZ
77
, characters (character strings) occurring in an inputted data string are stored in a buffer in advance, and a storing position (address) and a length of characters (a character string) in the buffer longest-matching with inputted characters (a character string) that is an object of compression are coded as a code of the inputted character (character string). In LZ
78
, characters (a character string) occurring in an inputted data string in the past are registered in a dictionary, and a register number of characters (a character string) in the dictionary matching with inputted characters (a character string) that is an object of compression is coded as a code of the inputted characters (character strings).
On the other hand, in the latter statistical coding, a frequency of occurrence of each character (character string) occurring in an inputted data string are calculated, and a shorter code is assigned to a character (character string) having a higher probability so as to improve a compression ratio.
Arithmetic coding (refer to “Arithmetic Coding for Data Compression”/IAN H. WITTEN, et al., Communication of the ACM, Vol.130, No.6, P520-540, “An Adaptive Dependency Source Model for Data Compression Scheme”/D. M. Abrahamson, Communication of the ACM, Vol.132, No.1, P77-83) and Haffman coding (refer to “Dynamic Haffman Coding”/Donald E. Knuth, Journal of Algorithms, Vol.6, P163-180) are representatives of statistical coding.
As statistical coding, there is proposed another system, in which inputted characters are coded into a variable-length code on the basis of, not a probability of one character, but a conditional probability in consideration of dependency between an inputted character and a character immediately before the inputted character (hereinafter, referred to as a context) as shown in
FIG. 48
, for example, in order to accomplish a higher compression effect (hereinafer, such variable-length coding using a conditional probability in consideration of a context is referred as a context modeling).
In concrete, the context modeling collects a context from an inputted data string (original data), successively registers characters that are objects of coding [refer to FIG.
49
(
a
)] in a dictionary of a tree structure [hereinafter referred as a context tree, refer to FIG.
49
(
b
)] counts occurrences of each character each time a character string is inputted which traces characters registered in respective nodes of the context tree to obtain a conditional probability, and codes the original data on the basis of the obtained probability.
Both of dictionary-based coding and statistical coding are classified into three types as shown in items (1) through (3) below according to a way of considering occurrences of a data string that is an object of compression (hereinafter referred as a data string to be compressed):
(1) static coding: coding a character (character string) in a data string to be compressed according to occurrences set in advance irrespective of actual occurrences of the data string to be compressed;
(2) semi-adaptive coding: coding each character (character string) according to occurrences of each character (character string) obtained by scanning all characters (character strings) in a data string to be compressed before compression; and
(3) adaptive coding: re-counting occurrences of a character (character string) each time the same character (character string) as a character (character string) inputted in the past is inputted, and coding an inputted character (character string) according to the re-counted occurrences.
In static coding, a computer reads a dictionary set irrespectively of occurrences of an actual data string to be compressed from a memory or the like (set a dictionary: Step
1
), fixedly uses the dictionary read out until inputted characters (character strings) end (until judged YES at Step A
4
) to code each of the inputted characters (character strings) (Steps A
2
and A
3
, NO route at Step A
4
), as shown in
FIG. 50
, for example.
On the other hand, in semi-adaptive coding, the computer successively registers data string (characters/character string) to be compressed in a dictionary (Steps B
1
and B
2
, NO route at Step B
3
), and assigns a code to each of the characters (character strings) registered in the dictionary according to occurrences of the character (character string) to code the dictionary (from YES route at Step B
3
to Step B
4
), as shown in FIG.
51
, for example.
The computer then puts back a pointer pointing a character (character string) to be inputted to the head of a data string to be compressed (Step B
5
), re-inputs the above data string (characters/character strings) to be compressed (Step B
6
), and codes each of the character (character strings) while referring to the above dictionary (Step B
7
, NO route at Step B
8
) until coding of all the data strings to be compressed is completed (until judged YES at Step B
8
).
In adaptive coding, the computer codes inputted a character (or a character string) referring to a dictionary set in advance (Step C
2
) when a data string (character or character string) to be compressed is inputted (Step C
1
) similarly to static coding in the beginning, as shown in
FIG. 52
, for example. After that, the computer re-counts occurrences of a coded character (character string), registers a code according to the obtained occurrences as a new code of the character (character string) in the dictionary (Step C
3
), and codes each character (character string) while updating the dictionary (NO route at Step C
4
) until coding of all the data strings to be compressed is completed (until judged YES at Step C
4
).
Above static coding performs coding fixedly using a dictionary set in advance. Therefore, the static coding can always achieve a constant compression ratio with respect to d
Morihara Takashi
Satoh Noriko
LandOfFree
DATA COMPRESSING APPARATUS AND A DATA DECOMPRESSING... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with DATA COMPRESSING APPARATUS AND A DATA DECOMPRESSING..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and DATA COMPRESSING APPARATUS AND A DATA DECOMPRESSING... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3039941