Coded data generation or conversion – Digital code to digital code converters – To or from packed format
Reexamination Certificate
2001-10-31
2002-12-10
Jeanpierre, Peguy (Department: 2819)
Coded data generation or conversion
Digital code to digital code converters
To or from packed format
C341S050000
Reexamination Certificate
active
06492917
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an improved system and method for implementing the YK lossless data compression and decompression. More particularly, the present invention relates to a system and method for performing lossless data compression and decompression using irreducible grammar whose elements are represented as linked-list data structures. The present invention proposes a three-module computational architecture for implementing the YK lossless compression and decompression algorithm, with minimal communication across modules.
2. Description of the Related Art
Data compression algorithms have been developed to reduce the size of a data string so that the data string can, for example, be more efficiently stored and transmitted. An example of a grammar-based framework for lossless data compression algorithm is described in a publication by J. C. Kieffer and E.-H. Yang entitled “Grammar based codes: A new class of universal lossless source codes,”
IEEE Transactions on Information Theory,
vol. 46, no. 3, pp. 737-754, May 2000, the entire content of which is incorporated herein by reference.
As described in the Kieffer and Yang publication, to compress a data sequence, a grammar-based code first transforms the data sequence into a context-free grammar, from which the original data sequence can be fully reconstructed by performing parallel substitutions. This context-free grammar, which is a compact representation of original data, is compressed using arithmetic coding. If a grammar-based code transforms each data sequence into an irreducible context-free grammar, then the grammar-based code is universal for the class of stationary, ergodic sources.
Grammar-based codes offer a design framework in which arithmetic coding and string matching capabilities are combined in an elegant manner. Within the framework of grammar-based codes, an efficient greedy grammar transform, known as the Yang-Kieffer grammar transform, is described in a publication by E. H. Yang and J. C. Kieffer entitled “Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform—Part one: Without context models,”
IEEE Transactions on Information Theory,
vol. 46, pp. 755-777, May 2000, the entire content of which is incorporated herein by reference. This grammar transform sequentially constructs a sequence of irreducible context-free grammars from which the original data sequence can be recovered incrementally. The transform sequentially parses the original data in non-overlapping, variable-length phrases, and updates the grammar accordingly. Based on this grammar transform, a sequential lossless data compression, known as the YK algorithm, was developed.
The YK algorithm efficiently encodes the grammar by exploiting the structure of the grammar and using arithmetic coding to encode the sequence of parsed phrases at each step. The YK algorithm jointly optimizes, in some sense, string matching and arithmetic coding capabilities to achieve excellent compression performance on a large class of data. The following is a brief description of the major operations in the YK algorithm.
MAJOR OPERATIONS IN THE YK ALGORITHM
In defining the variables used in the following description, let A be the source alphabet with cardinality |A| greater than or equal to 2. let A
+
denote the set of all finite strings drawn from A, let x=x
1
x
2
K x
n
x=x
1
x
2
K x
n
be a finite string drawn for the alphabet A that is to be compressed, and let S={s
n
,s
1
,s
2
,K} be a countable set, disjoint from A. The symbol s
0
is called the start symbol, while elements of S
+
={s
1
,s
2
,K} are called variables. For j≧1, let S(j)={s
0
,s
1
,K,s
j−1
}, and let S
+
(j)={s
1
,K,s
j−1
}. A context-free grammar G is a mapping from S(j) to (S(j)∪A)
+
for some j≧1. The mapping G is explicitly represented by writing each relationship (s
i
,G(s
i
)) as s
i
→G(s
i
), for i<j. G(s) is called as the production rule for
This grammar is an example of an admissible grammar because one obtains a sequence from A after finitely many parallel replacements of variables in G(s
i
), and every variable s
i
(i<j) is replaced at least one by G(s
i
). The resulting sequence from A is called A-string represented by the respective variable. These A-strings are intended to represent repeated phrases in the input data string. The start symbol is special because the A-string represented by it is the complete input data string.
In addition, this grammar is an irreducible grammar because none of Reduction Rules 1 to 5 described in the second Yang and Keiffer publication referenced above can be applied to it to get a new admissible grammar. Further details of the concepts of admissible grammar and irreducible grammar can be found in that publication. An example of an irreducible grammar representing the data string abacaacabbacbabaccabaccc is shown as follows:
s
0
→s
3
s
1
s
1
bbs
2
bs
4
s
4
c
s
1
→s
2
a
s
2
→ac
s
3
→ab
s
4
→s
3
s
2
c
A grammar-based framework was proposed in a publication by C. Nevill-Manning and I. Witten entitled “Compression and explanation using hierarchical grammar,”
Computer Journal,
vol. 40, pp. 103-116 1997, the entire content of which is incorporated herein by reference. However, the grammar described in that publication need not satisfy the irreducibility property, and hence can be suboptimal.
The main steps of the YK algorithm are briefly summarized here. Let x=x
1
x
2
&Lgr; x
n
be a sequence from A, that is to be compressed. The YK algorithm proceeds through an iterative application of the following three main operations:
1. Parsing: The parsing operation parses the sequence x sequentially into non-overlapping substrings {x
1
,x
2
&Lgr; x
n
2
,K,x
n
t−1
+1
&Lgr; x
n
t
} and builds sequentially an irreducible grammar G
i
for each x
1
&Lgr; x
n
i
, where 1≦i=1, n
1
=1, and n
1
=n. The first substring is x
1
, and the corresponding irreducible grammar G
1
consists of only one production rule s
0
→x
1
. At the i
th
step, suppose that the first i phrases x
1
, x
2
&Lgr; x
n
2
, . . . , x
n
i−1
+1
&Lgr; x
n
i
have been parsed off. Suppose the variable set of the current grammar G
i
is S
+
(j
i
)={s
1
,&Lgr;,s
j
i
−1
}, with j
1
=1. The next substring x
n
i
+1
&Lgr; x
n
i+1
is the longest prefix of x
n
i
+1
&Lgr; x
n
that can be represented by s
j
if such a prefix exists in S
+
(j
i
). If such a prefix exists, the next parsed phrase is s
j
; otherwise, the next parsed phrase is the symbol x
n
i
+1
.
2. Grammar and Search-List Update: Let &agr; denote the last symbol on the right end of G
i
(s
0
). The parsed phrase (denoted by &bgr;) is appended to the right end of G
i
(s
0
) to give the appended grammar G′
1
. The appended grammar is reduced, if possible, using Reduction Rules as described, for example, in the second Yang and Keiffer publication referenced above to yield an irreducible grammar G
i+1
. Define the indicator sequence I:{1,2,&Lgr;,t}→{0,1} as follows: I(1)=0, and for any i>1, I(i) is equal to 0 if G
i+1
equals G′
i
(i.e. reduction was not possible), and 1 otherwise (i.e. reduction was possible). It is proved in ref. 1 that the grammar G′
i
is reducible if and only if &agr;&bgr; is the only non-overlapping repeated pattern of length ≧2 in G′
i
. To determine if the appended grammar is reducible, a search-list L
1
(&ggr;) is maintained for all symbols in A Y S(j
i
), where L
1
(&ggr;) consists of all the elements &eegr; such that the appended grammar G′
i
would have the pattern &ggr;&eegr; as a potential match if the next parsed phrase is &eegr;. Depending on the values of I(i+1) and I(i), there are three possible distinct cases:
Case 0: I(i+1
Banerji Ashish
Goel Saching
Hughes Electronics Corporation
Jeanpierre Peguy
Lauture Joseph
Sales Michael
Whelan John T.
LandOfFree
System and method for implementation of the YK lossless data... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with System and method for implementation of the YK lossless data..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and method for implementation of the YK lossless data... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2920856