Method for compressing digital documents with control of...

Image analysis – Image compression or coding – Adaptive coding

Reexamination Certificate


Details

U.S. classifications: C382S176000, C382S224000, C382S233000, C341S051000, C358S426020
Status: active
Patent number: 06731814

ABSTRACT:

In digital systems, image-format documents are often compressed to save storage costs or to reduce transmission time through a transmission channel. Lossless compression can achieve very good compression on regions of the document that are computer rendered, such as characters and graphics, but areas that contain scanned image data do not compress well losslessly. Compression technologies such as JPEG work well on the scanned, continuous-tone areas of a document; however, JPEG, and transform-coding technologies in general, cause image quality problems at the high-contrast edges produced by computer-rendered objects. The solution to this problem is to apply different compression technologies to different regions of the document to optimize both image quality and compressibility.
A method for digital image compression of a raster image is disclosed which uses different compression methods for selected parts of the image and which adjusts the compression and segmentation parameters to control the tradeoff between image quality and compression. The image, including rendering tags that can accompany each pixel, is encoded into a single data stream for efficient handling by disk, memory and I/O systems. The uniqueness of this system lies in the content-dependent separation of the image into lossy and lossless regimes, the transmission of only those blocks containing information, and the adjustable segmentation and compression parameters used to control the image data rate (compression rate) averaged over extremely small intervals (typically eight scan lines).
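The content-dependent separation into lossy and lossless regimes can be illustrated with a toy per-block classifier. The threshold and block shape here are illustrative assumptions, not the patent's actual segmentation criteria:

```python
def classify_block(block, max_levels=4):
    """Route a block of pixel values to 'lossless' or 'lossy' coding.

    Computer-rendered regions (text, graphics) tend to use few distinct
    gray levels and compress well losslessly; scanned continuous-tone
    regions show many levels and suit a lossy transform coder.  The
    max_levels threshold is an illustrative assumption.
    """
    return "lossless" if len(set(block)) <= max_levels else "lossy"

# A flat rendered region vs. a noisy scanned region (8 samples each).
rendered = [0, 0, 255, 255, 0, 0, 255, 255]      # 2 distinct levels
scanned = [12, 47, 89, 130, 171, 203, 228, 241]  # 8 distinct levels

assert classify_block(rendered) == "lossless"
assert classify_block(scanned) == "lossy"
```

A real segmenter would also consult the per-pixel rendering tags mentioned above rather than pixel statistics alone.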
The graphic arts world, and Scitex in particular, as exemplified in the TIFF/IT standard (ISO 12639:1997(E), "Graphic technology - Prepress digital data exchange - Tag image file format for image technology (TIFF/IT)"), has separated documents into continuous tone (CT) pictures and line work (LW), maintaining different resolutions for each and applying different compression techniques to each (JPEG and run-length encoding, respectively). The links between the two image planes are found in the LW channel.
U.S. Pat. No. 5,225,911 to Buckley et al. uses similar encodings but replaces the LW channel with several data streams, including mask, color, and rendering tags.
Compressed image printing has been used for over a decade for binary images using one of several standard or proprietary formats: CCITT, JBIG, Xerox Adaptive (Raster Encoding Standard, as discussed in Buckley), and Interpress. These single-plane compression schemes are lossless and, although often quite effective (20:1), can for some images give little or no compression.
U.S. patent application Ser. No. 09/206,487 also separates the image into two planes, but each plane is completely sent. Three data streams are used (two image planes and a separation mask) and no mechanism exists to control local data rate.
JPEG is a standard for compressing continuous tone images. The acronym stands for Joint Photographic Experts Group. JPEG is divided into a baseline system that offers a limited set of capabilities, and a set of optional extended system features. JPEG provides a lossy high-compression image coding/decoding capability. In addition to this lossy coding capability, JPEG incorporates progressive transmission and a lossless scheme as well.
JPEG utilizes a discrete cosine transform (DCT) as part of the encoding process to provide a representation of the image that is more suitable to lossy compression. The DCT transforms the image from a spatial representation to a frequency representation. Once in the frequency domain, the coefficients are quantized to achieve compression. A lossless encoding is used after quantization to further improve compression performance. The decoder executes the inverse operations to reconstruct the image.
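The transform-then-quantize pipeline can be sketched in one dimension. The 8-point type-II DCT below is the 1-D building block of JPEG's 8x8 2-D DCT; the flat quantizer step of 10 is an illustrative assumption, not a JPEG quantization table:

```python
import math

def dct_1d(x):
    """Orthonormal type-II DCT (the 1-D building block of JPEG's 8x8 DCT)."""
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * s)
    return out

def quantize(coeffs, step):
    """Uniform quantization; most high-frequency coefficients round to zero."""
    return [round(c / step) for c in coeffs]

# A smooth 8-sample row: after the DCT, energy concentrates in the
# low-frequency coefficients, so quantization zeroes most of the rest
# and the lossless entropy coder that follows has little left to code.
row = [52, 55, 61, 66, 70, 61, 64, 73]
q = quantize(dct_1d(row), 10)
```

Decoding applies the inverse operations: multiply the coefficients back by the step and run the inverse DCT, recovering an approximation of `row`.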
Dictionary based compression methods use the principle of replacing substrings in a data stream with a codeword that identifies that substring in a dictionary. This dictionary can be static if knowledge of the input stream and statistics are known or can be adaptive. Adaptive dictionary schemes are better at handling data streams where the statistics are not known or vary.
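With a static dictionary the substitution is direct. A minimal sketch, assuming the one-byte codewords do not occur in the input text (a real coder would have to guarantee that):

```python
# Hypothetical static dictionary built from known input statistics.
CODEWORDS = {"the ": "\x01", "and ": "\x02"}

def static_encode(text, codewords=CODEWORDS):
    """Replace each dictionary substring with its one-byte codeword."""
    for phrase, code in codewords.items():
        text = text.replace(phrase, code)
    return text

def static_decode(text, codewords=CODEWORDS):
    """Expand each codeword back to its dictionary substring."""
    for phrase, code in codewords.items():
        text = text.replace(code, phrase)
    return text

msg = "the cat and the dog"
packed = static_encode(msg)
assert static_decode(packed) == msg and len(packed) < len(msg)
```

If the input statistics drift away from those the dictionary was built for, compression degrades, which is the motivation for the adaptive schemes that follow.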
Many adaptive dictionary coders are based on two related techniques developed by Ziv and Lempel. The two methods are often referred to as LZ77 (or LZ1) and LZ78 (or LZ2). Both methods use a simple approach to achieve adaptive compression. A substring of text is replaced with a pointer to a location where the string has occurred previously. Thus the dictionary is all or a portion of the input stream that has been processed previously. Using the previous strings from the input stream often makes a good choice for the dictionary, as substrings that have occurred will likely reoccur. The other advantage to this scheme is that the dictionary is transmitted essentially at no cost as the decoder can generate the dictionary from the previously coded input stream. The many variations of LZ coding differ primarily in how the pointers are represented and what the pointers are allowed to refer to.
LZ1 is a relatively easy to implement version of a dictionary coder. The dictionary in this case is a sliding window containing the previous data from the input stream. The encoder searches this window for the longest match to the current substring in the input stream. Searching can be accelerated by indexing prior substrings with a tree, hash table, or binary search tree. Decoding for LZ1 is very fast in that each code word is an array lookup and a length to copy to the output (uncoded) data stream.
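A minimal LZ1-style coder (greedy longest match found by a linear window scan; a production coder would use the hash-table or tree indexing mentioned above):

```python
def lz77_encode(data, window=256, max_len=32):
    """Greedy LZ77: emit (offset, length, next_char) triples, where
    offset/length point back into the sliding window of already-seen
    input -- the window doubles as the dictionary, transmitted for free."""
    i, out = 0, []
    while i < len(data):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len and i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decode(triples):
    """Decoding is just a back-copy per triple: an array lookup and a
    length to copy, which is why LZ1 decoding is very fast."""
    out = []
    for off, length, ch in triples:
        for _ in range(length):
            out.append(out[-off])
        out.append(ch)
    return "".join(out)

msg = "abracadabra abracadabra"
assert lz77_decode(lz77_encode(msg)) == msg
assert len(lz77_encode(msg)) < len(msg)   # 7 triples for 23 characters
```

Note that back-copies may overlap the data being produced (offset smaller than length), which the byte-at-a-time copy in the decoder handles naturally.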
In contrast to LZ1, where pointers can refer to any substring in the window of prior data, the LZ2 method places restrictions on which substrings can be referenced. However, LZ2 does not have a window to limit how far back substrings can be referenced. This avoids the inefficiency of having more than one coded representation for the same string that can occur frequently in LZ1.
LZ2 builds its dictionary by matching the current substring from the input stream against a stored dictionary that is adaptively generated from the contents of the input stream. As each input substring is looked up, the longest match is located, but the match must begin with the first character of a dictionary entry. So if the character "a" begins the current substring, only dictionary entries that start with "a" are searched. Generally this leads to a good match between the input substring and the dictionary. However, if the substring "bacdef" were in the dictionary, then "acdef" from the input stream would not match that entry, since the dictionary entry starts with "b". This differs from LZ1, which may generate a best match anywhere in the window and could produce a pointer to "acdef".
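The restriction is visible in a minimal LZ2-style coder: each output token is (index of the longest stored phrase that prefixes the remaining input, next character), so matches can only extend an existing dictionary phrase from its first character:

```python
def lz78_encode(data):
    """LZ78/LZ2: grow a stored dictionary of phrases as the input is
    scanned; there is no sliding window, so each phrase has exactly one
    coded representation.  Index 0 denotes the empty phrase."""
    dictionary = {"": 0}
    out, phrase = [], ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch              # keep extending the known phrase
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:                        # input ended inside a known phrase
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out

def lz78_decode(tokens):
    """Rebuild the same phrase dictionary on the fly -- again the
    dictionary travels at no cost."""
    phrases, out = [""], []
    for idx, ch in tokens:
        s = phrases[idx] + ch
        phrases.append(s)
        out.append(s)
    return "".join(out)

msg = "aaabbabaabaaabab"
assert lz78_decode(lz78_encode(msg)) == msg
```

This is a sketch of the basic scheme; practical variants (e.g. LZW) differ in how tokens are represented and how the dictionary is seeded and pruned.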
U.S. Pat. No. 5,479,587 discloses a print buffer minimization method in which the raster data is compressed by trying different compression procedures with increasing compression ratios until the raster data is compressed sufficiently to fit in a given print buffer. Each time, a compression procedure with a higher compression ratio is selected from a predefined repertoire of such procedures, ranging from lossless ones such as run-length encoding to lossy ones. Generally, lossless encoding is efficient on text and line-art data while lossy encoding is effective on image data. However, this method may produce poor print quality when the nature of the raster page calls for lossy compression in order to achieve a predetermined compression ratio. This is because the single selected compression procedure is applied uniformly across each strip of the page, and when the strip contains image data as well as text or line-art data, the lossy compression procedure will generally blur the sharp lines that delineate text or line art, or may introduce undesirable artifacts.
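That try-until-it-fits strategy can be sketched with a stand-in repertoire. The procedures below (zlib at increasing levels, then a crude 2:1 lossy subsampling) are illustrative assumptions, not the patent's actual repertoire:

```python
import zlib

def compress_to_fit(strip: bytes, budget: int):
    """Try progressively more aggressive procedures until the strip of
    raster data fits the print-buffer budget; fall back to a lossy
    procedure as a last resort (which, as noted above, can blur the
    sharp edges of text and line art in mixed strips)."""
    for level in (1, 6, 9):                 # lossless, increasing effort
        packed = zlib.compress(strip, level)
        if len(packed) <= budget:
            return "lossless", packed
    # Lossy last resort: drop every other byte, then deflate.
    return "lossy", zlib.compress(strip[::2], 9)

strip = bytes(range(256)) * 16              # 4 KiB of synthetic raster data
mode, packed = compress_to_fit(strip, budget=2048)
```

A decoder would need the chosen mode recorded alongside each strip in order to invert the right procedure.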
European Patent Publication No. 0597571 discloses a method in which the types of objects in a page are first extracted and the bou
