Arithmetic coding-based facsimile compression with error...

Facsimile and static presentation processing – Facsimile – Picture signal generator

Reexamination Certificate


Details

C382S247000

Reexamination Certificate

active

06760129

ABSTRACT:

FIELD OF THE INVENTION
The present invention generally concerns the field of facsimile transmission of documents, and more particularly concerns an algorithm for compressing facsimile documents based on arithmetic coding.
BACKGROUND OF THE INVENTION
Arithmetic coding is a well-known concept that has been applied in a number of information transmission environments, including facsimile. The basic idea behind arithmetic coding is the mapping of a sequence of symbols to be encoded to a real number in the interval [0.0, 1.0), where the square bracket “[” indicates that the endpoint is included and the parenthesis “)” indicates that it is excluded. The binary expansion of this real number is then transmitted to the arithmetic decoder, where the inverse mapping is performed to retrieve the encoded symbols.
Conventional coding techniques, including Huffman coding, assign distinct code words to different input symbols and achieve data compression by assigning shorter code words to more probable symbols. In arithmetic coding, by contrast, there is no assignment of specific code words to input symbols. Instead, the interval [0.0, 1.0) is divided into many sub-intervals, each of which is assigned to a different input symbol. Less probable symbols get shorter intervals, and compression is achieved by assigning longer intervals to more probable symbols.
An illustration of this principle is in FIG. 1, which can form the basis of an example of arithmetic coding. The illustration shows that the input symbol takes two possible values, B and W. In a facsimile document, B and W correspond to black and white picture elements (pels) respectively. Let P_b and P_w be the probabilities of occurrence of B and W respectively. To begin with, the coding interval is [0.0, 1.0), with the lower bound L_0 = 0.0, the upper bound U_0 = 1.0 and the range (which is the difference between the upper and lower bounds) R_0 = 1.0. This coding interval is divided into two sub-intervals: [0.0, P_w) corresponding to W, and [P_w, 1.0) corresponding to B. The lengths of the sub-intervals are proportional to the probabilities of B and W. If P_b = 0.75 and P_w = 0.25, then the interval corresponding to B is three times longer than the interval corresponding to W.
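For readers who prefer to see this split numerically, the short Python sketch below (illustrative only, using the P_b = 0.75 and P_w = 0.25 values above) computes the two sub-intervals of the initial coding interval:

# Split the initial coding interval [0.0, 1.0) in proportion to the
# symbol probabilities: W gets the lower sub-interval, B the upper one.
P_B, P_W = 0.75, 0.25

lower, upper = 0.0, 1.0
coding_range = upper - lower

w_interval = (lower, lower + coding_range * P_W)   # [0.0, 0.25) -> W
b_interval = (lower + coding_range * P_W, upper)   # [0.25, 1.0) -> B
print(w_interval, b_interval)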
When each symbol in the input sequence is coded, the lower bound, the upper bound and the range will change. If the first symbol in the sequence to be encoded is B, the new coding interval is [P_w, 1.0). The upper bound remains the same. The new lower bound is L_1 = P_w, and the new range is R_1 = R_0 P_b = P_b. The shaded portion in FIG. 1 represents the coding interval.
This coding interval is further sub-divided into two sub-intervals: [P_w, P_w + P_b P_w) corresponding to W, and [P_w + P_b P_w, 1.0) corresponding to B. If the next symbol is a W, the new coding interval is [P_w, P_w + P_b P_w). The lower bound remains unchanged at P_w, and the new range is R_2 = R_1 P_w = P_b P_w.
FIG. 1 shows the changes in the lower bound, the upper bound and the range for the first four symbols, the third and fourth symbols both being B. The lower bound after the third symbol is P_w + P_b P_w^2, and the coding interval is [P_w + P_b P_w^2, P_w + P_b P_w).
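These symbolic bounds can be checked with a short numeric trace (a sketch only, again assuming P_b = 0.75 and P_w = 0.25) that prints the lower bound, upper bound and range after each of the four symbols B, W, B, B:

# Trace L (lower bound), R (range) and U = L + R while encoding B, W, B, B
# with fixed probabilities, following the subdivision described above.
P_B, P_W = 0.75, 0.25

L, R = 0.0, 1.0
for n, symbol in enumerate("BWBB", start=1):
    if symbol == "B":          # B occupies the upper sub-interval
        L = L + R * P_W
        R = R * P_B
    else:                      # W occupies the lower sub-interval; L is unchanged
        R = R * P_W
    print(f"after symbol {n} ({symbol}): L = {L:.6f}, U = {L + R:.6f}, R = {R:.6f}")

# After the third symbol: L = P_w + P_b * P_w**2 = 0.296875
#                         U = P_w + P_b * P_w    = 0.4375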
In general, if L_(n−1) and R_(n−1) are the values of the lower bound and the range before the nth symbol is encoded, the following rule is used to update these values:
If the nth symbol is B:
L_n = L_(n−1) + R_(n−1) P_w
R_n = R_(n−1) P_b.
If the nth symbol is W:
L_n = L_(n−1)
R_n = R_(n−1) P_w.
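This update rule translates directly into code. The sketch below is a floating-point illustration only (a practical codec would use integer arithmetic with renormalisation to avoid the precision loss of repeated multiplications); it encodes a B/W sequence and returns the final coding interval:

def encode(symbols, p_b, p_w):
    # Direct transcription of the rule above: B takes the upper part of the
    # current interval, W takes the lower part and leaves the lower bound alone.
    L, R = 0.0, 1.0
    for s in symbols:
        if s == "B":
            L += R * p_w
            R *= p_b
        else:
            R *= p_w
    return L, R          # final coding interval is [L, L + R)

# Any real number inside [L, L + R) identifies the sequence to the decoder;
# the midpoint is one convenient choice.
L, R = encode("BWB", p_b=0.75, p_w=0.25)
code_value = L + R / 2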
The corresponding upper bound U_n can be obtained by adding L_n and R_n. Finally, when all the symbols in the input have been encoded, the binary expansion corresponding to any real number in the current coding interval is transmitted to the arithmetic decoder. With knowledge of the probabilities P_b and P_w, the arithmetic decoder follows the same sequence of dividing the intervals to retrieve the coded input.
For example, let the encoder encode only three input symbols and transmit a real number that lies in the coding interval after encoding the first three symbols. The decoder receives this number; call it value. This value lies in the interval [P_w + P_b P_w^2, P_w + P_b P_w). To begin decoding, the decoder divides the coding interval [0.0, 1.0) into two intervals: [0.0, P_w) corresponding to W and [P_w, 1.0) corresponding to B. Since value lies in the interval corresponding to B, the decoder decodes the first symbol as B. Now the new coding interval is [P_w, 1.0), which in turn is sub-divided into two intervals: [P_w, P_w + P_b P_w) corresponding to W and [P_w + P_b P_w, 1.0) corresponding to B. Since value lies in the interval corresponding to W, the decoder decodes the second symbol as W, and so on.
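The decoding steps just described can be sketched in the same style (illustrative code with fixed probabilities; it assumes, as in the example, that the decoder knows how many symbols were encoded):

def decode(value, n_symbols, p_b, p_w):
    # Recover n_symbols from a real number lying inside the final coding interval.
    L, R = 0.0, 1.0
    out = []
    for _ in range(n_symbols):
        split = L + R * p_w        # boundary between the W and B sub-intervals
        if value < split:          # value in [L, split)      -> W
            out.append("W")
            R *= p_w
        else:                      # value in [split, L + R)  -> B
            out.append("B")
            L = split
            R *= p_b
    return "".join(out)

# 0.3671875 lies in [P_w + P_b*P_w**2, P_w + P_b*P_w) = [0.296875, 0.4375),
# the interval produced by encoding B, W, B with P_b = 0.75 and P_w = 0.25.
print(decode(0.3671875, n_symbols=3, p_b=0.75, p_w=0.25))   # -> BWB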
An important task in arithmetic coding is the proper estimation of the probabilities P_b and P_w. The compression efficiency that can be achieved depends on the accuracy of these probability estimates. In the discussion presented above, it was implicitly assumed that the values of P_b and P_w are known and remain the same during the encoding process. In practice, there does not exist a single set of probability values that can efficiently model different facsimile documents. Even within a single document there is so much variation (for example, the borders predominantly contain white pels, whereas the text in the document contains more black pels) that it is not feasible to use a single set of probability values.
A more efficient scheme is to initialize the probabilities to suitable values at the start of encoding and adapt the values as the encoding proceeds. In the definition of arithmetic coding, there is no restriction that the probabilities remain unchanged during encoding, as long as the decoder can exactly mimic the changes in the probability values. One way of adapting the probabilities is to initialize counts N_b and N_w to 1 and, after encoding each input symbol that is either a B or a W, increment the corresponding count by one. The probabilities of W and B can be obtained from the counts, using
P_w = N_w / (N_w + N_b) and P_b = N_b / (N_w + N_b).
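A minimal sketch of this count-based adaptation is shown below (the class name and structure are illustrative, not taken from the patent); the decoder performs the identical updates, so both sides compute the same probabilities at every step:

class AdaptiveModel:
    # Adaptive estimate of P_b and P_w from running counts of coded pels.
    def __init__(self):
        self.n_b = 1           # unbiased initialization: both counts start at 1
        self.n_w = 1

    def probabilities(self):
        total = self.n_b + self.n_w
        return self.n_b / total, self.n_w / total    # (P_b, P_w)

    def update(self, symbol):
        # After coding each pel, increment the count of the colour just seen.
        if symbol == "B":
            self.n_b += 1
        else:
            self.n_w += 1

model = AdaptiveModel()
for s in "BWBB":
    p_b, p_w = model.probabilities()   # probabilities used to code this pel
    model.update(s)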
This initialization is an unbiased initialization, since the initial values for N_b and N_w are equal. It is also possible to use a biased initialization.
The aforementioned method of updating the counts N_b and N_w helps to adapt to the local distribution of white and black pels. However, this scheme fails to exploit the redundancy present in the document. The counts N_b and N_w give a measure of the probability of a pel being white or black. But the probability of a pel being white or black can be better described if the “color” of some of the adjacent pels is known. This set of adjacent pels is defined to be the context. In FIG. 2, the pel marked ‘s’ is being presently encoded and the shaded pels form the context for ‘s’. The context in this case consists of 4 pels, each of which can be a B or W. So there are 16 (= 2^4) possible combinations (or states) that the context can assume. The redundancy in the document can be exploited by maintaining a pair of counts for each possible state of the context. When a pel ‘s’ is encoded, first the state is determined from the context, and then the counts corresponding to that state are used in encoding.
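The bookkeeping for such context-dependent counts can be sketched as follows (illustrative only: the patent's FIG. 2 defines which four neighbouring pels form the context, so the neighbour offsets used here merely stand in for the shaded pels):

# One pair of counts per context state.  A 4-pel context gives 16 states,
# each holding its own [N_w, N_b] pair with unbiased initialization.
counts = [[1, 1] for _ in range(16)]

def context_state(page, row, col):
    # Pack four previously coded neighbour pels (1 = black, 0 = white) into
    # a 4-bit state.  The offsets below are only a stand-in for FIG. 2.
    neighbours = [(row, col - 1), (row - 1, col - 1),
                  (row - 1, col), (row - 1, col + 1)]
    state = 0
    for r, c in neighbours:
        pel = page[r][c] if 0 <= r < len(page) and 0 <= c < len(page[0]) else 0
        state = (state << 1) | pel
    return state

def probabilities(state):
    n_w, n_b = counts[state]
    total = n_w + n_b
    return n_b / total, n_w / total        # (P_b, P_w) for this state

def update(state, pel):
    counts[state][1 if pel else 0] += 1    # bump N_b for black, N_w for white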
It is necessary to exercise caution in selecting the context. Selecting too small a context might not exploit the redundancy in the document well, and will therefore decrease the compression efficiency. Selecting too large a context might merely increase the computational complexity and memory requirements without contributing to the compression efficiency. The implementation of arithmetic coding that was used in the simulations for this invention is a modification of the version presented in Witten, I. H., Neal,
