Patent
1997-11-14
2000-02-01
Grant, II, Jerome
Facsimile and static presentation processing
Static presentation processing
Data corruption, power interruption, or print prevention
358 12, 358 11, G06T 1500, G05B 1100
active
060209720
ABSTRACT:
A method and apparatus for compressing a corpus of document images into a collective tokenized representation. Initially, documents in the corpus are individually compressed into a document tokenized format. A document image in the document tokenized format is represented using a symbol table and a table of positions. Each symbol in the symbol table is a shape in the original document image. The positions in the table of positions indicate where the symbols in the symbol table are placed to form the document image. Subsequently, the individual symbol tables of each document in the corpus are assembled to form clusters of similar shapes. These clusters are then analyzed to identify the degree of interrelationship between the symbols in the individual symbol tables. Individual document symbol tables with a large number of recurring symbols are grouped together. For each of the groups of symbol tables, a collective symbol table is computed. The collective symbol table improves the compression ratio of a corpus by eliminating redundant shapes appearing in the individual document symbol tables. Also, the collective symbol table advantageously identifies groupings of documents in the corpus that are related because a significant number of similar shapes are used in each of the documents.
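The tokenized representation and the collective symbol table described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the names TokenizedDocument, build_collective_table, and shared_symbols are hypothetical, shapes are modelled as tiny bitmaps, and two shapes count as "similar" only when they are exactly equal, which stands in for the shape clustering the abstract refers to.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumption: a shape is a tiny bitmap, one string per pixel row.
Shape = Tuple[str, ...]
Placement = Tuple[int, int, int]  # (symbol index, x, y)


@dataclass
class TokenizedDocument:
    symbols: List[Shape]         # per-document symbol table
    positions: List[Placement]   # where each symbol is placed on the page


def build_collective_table(docs: List[TokenizedDocument]):
    """Merge per-document symbol tables into one shared table.

    Exact equality stands in for shape similarity (an assumption).
    Returns the collective table and each document's positions
    re-indexed against it.
    """
    collective: List[Shape] = []
    index_of: Dict[Shape, int] = {}
    remapped: List[List[Placement]] = []
    for doc in docs:
        local_to_global = []
        for shape in doc.symbols:
            if shape not in index_of:
                index_of[shape] = len(collective)
                collective.append(shape)
            local_to_global.append(index_of[shape])
        remapped.append([(local_to_global[i], x, y) for i, x, y in doc.positions])
    return collective, remapped


def shared_symbols(a: TokenizedDocument, b: TokenizedDocument) -> int:
    """Count shapes that recur in both symbol tables (a crude relatedness score)."""
    return len(set(a.symbols) & set(b.symbols))


# Hypothetical two-document corpus built from three distinct shapes.
A: Shape = ("##", "##")
B: Shape = ("#.", ".#")
C: Shape = (".#", "#.")

doc1 = TokenizedDocument(symbols=[A, B], positions=[(0, 0, 0), (1, 2, 0), (0, 4, 0)])
doc2 = TokenizedDocument(symbols=[B, C, A], positions=[(0, 0, 0), (2, 2, 0), (1, 4, 0)])

collective, remapped = build_collective_table([doc1, doc2])
```

In this toy corpus the two individual tables hold five shapes in total, while the collective table holds three, since the shapes A and B recur in both documents. A count such as shared_symbols could serve as one possible signal for grouping documents whose symbol tables overlap heavily.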
REFERENCES:
patent: 5303313 (1994-04-01), Mark et al.
patent: 5305433 (1994-04-01), Ohno
patent: 5321770 (1994-06-01), Huttenlocher et al.
patent: 5331556 (1994-07-01), Black, Jr. et al.
patent: 5504843 (1996-04-01), Catapano et al.
patent: 5539841 (1996-07-01), Huttenlocher et al.
patent: 5778361 (1998-07-01), Nanjo et al.
patent: 5884014 (1999-03-01), Huttenlocher et al.
patent: 5911140 (1999-06-01), Tukey et al.
patent: 5940822 (1999-08-01), Haderle et al.
U.S. Patent Application No. 08/575,305, entitled "Classification of Scanned Symbols into Equivalence Classes," to Daniel Davies, filed Dec. 20, 1995.
U.S. Patent Application No. 08/575,313, entitled "Consolidation Of Equivalence Classes Of Scanned Symbols," to Daniel Davies, filed Dec. 20, 1995.
U.S. Patent Application No. 08/652,864, entitled "Fontless Structured Document Image Representations for Efficient Rendering," to Daniel R. Huttenlocher et al., filed May 23, 1996.
U.S. Patent Application No. 08/655,546, entitled "Method and Apparatus for Comparing Symbols Extracted from Binary Images of Text," to William J. Rucklidge et al., filed May 30, 1996.
U.S. Patent Application No. 08/752,497, entitled "Using Fontless Structured Document Image Representations To Render Displayed And Printed Documents At Preferred Resolutions," to Daniel R. Huttenlocher et al., filed Nov. 8, 1996.
Mahoney James V.
Rucklidge William J.
Grant II Jerome
Tran Douglas
Xerox Corporation
System for performing collective symbol-based compression of a c