Comparing contents of electronic documents

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06324555

ABSTRACT:

BACKGROUND
The invention relates to techniques to compare electronic documents.
Some computer application programs can compare two files based on the alphanumeric text contents. Applications that compare program source code generally compare two text files line-by-line. Only the text is analyzed whereas other file aspects, such as formatting, are ignored. Conventional word processing applications also compare different files or different versions of a file based only on their text contents.
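As a rough illustration of the line-by-line text comparison described above, the following sketch uses Python's standard `difflib` module. The function name `changed_lines` is hypothetical and chosen only for this example; the source does not name any particular tool or algorithm.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return the lines removed ("- ") or added ("+ ") between two texts."""
    delta = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep only real changes.
    return [line for line in delta if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    for line in changed_lines("a\nb\nc", "a\nB\nc"):
        print(line)
```

Because only the text is passed in, two versions of a file that differ solely in formatting, images, or other non-text objects would produce an empty result here.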
Word processing applications increasingly produce content-rich files that integrate various objects with text, such as images, graphics, layout, color spaces, annotations, and so on. When such files are compared with the editing tools of the application programs, differences between the integrated objects are not detected, even though such differences affect the appearance of a page on a display screen or in print.
SUMMARY
In general, the process of the invention compares and matches two documents page by page. Each page has objects, such as text, graphics, images, color spaces, annotations, and so on. One example of documents of this type is the Portable Document Format (PDF) file from Adobe Systems, Inc. PDF files are arranged as a sequence of individual pages. Such files cannot be compared with conventional text-based document compare routines.
According to an aspect of the invention, a method executed in a computer system includes comparing pages of a first document to pages of a second document page by page.
According to a further aspect of the invention, a method executed in a computer system for comparing electronic documents on a page-by-page basis includes storing in a hash table a hash value of page attributes of a first document and using the hash value of a page of the second document to determine whether there is a match of the hash value in the hash table. The method also includes pairing the page of the second document with the page of the first document that has the hash value in the hash table.
According to a further aspect of the invention, a computer program product for comparing electronic documents, residing on a computer-readable medium, includes instructions for causing a computer to store in a hash table a hash value of page attributes of a first document. The program includes instructions to form hash values of page attributes of a second document and to use the hash value of a page of the second document to determine whether there is a match of the hash value in the hash table. The program further includes instructions to pair the page of the second document with the page of the first document that has the hash value in the hash table.
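The hash-table pairing described in the two preceding aspects can be sketched as follows. This is a minimal illustration under stated assumptions: each page's attributes are assumed to be already serialized to `bytes`, SHA-256 stands in for the unspecified hash function, and the names `page_hash` and `pair_by_hash` are hypothetical.

```python
import hashlib


def page_hash(attributes: bytes) -> str:
    """Hash the serialized attributes of one page (SHA-256 is an assumption)."""
    return hashlib.sha256(attributes).hexdigest()


def pair_by_hash(first_pages: list[bytes], second_pages: list[bytes]) -> list[tuple[int, int]]:
    """Pair pages of the second document with first-document pages of equal hash."""
    # Hash table: hash value -> indices of first-document pages not yet paired.
    table: dict[str, list[int]] = {}
    for i, attributes in enumerate(first_pages):
        table.setdefault(page_hash(attributes), []).append(i)

    pairs = []
    for j, attributes in enumerate(second_pages):
        candidates = table.get(page_hash(attributes))
        if candidates:  # a match exists in the hash table
            pairs.append((candidates.pop(0), j))
    return pairs


if __name__ == "__main__":
    print(pair_by_hash([b"A", b"B", b"C"], [b"C", b"X", b"A"]))
```

Keeping a list of indices per hash value is a design choice of this sketch; it lets identical pages that occur more than once be paired one-to-one, in order.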
According to a still further aspect of the invention, a method executed in a computer system for comparing electronic documents on a page-by-page basis includes computing a first digest of marking operators of each page of a first and a second document and pairing the pages of the first document with the pages of the second document that have identical first digests. The method also includes computing a second digest of a rendered bitmap of each page of the first document that is still unpaired and of each page of the second document that is still unpaired, and pairing the still unpaired pages of the first document with the still unpaired pages of the second document that have identical second digests. The method also includes computing a third digest of a subset of the rendered bitmap of each page of the first document that is still unpaired and of each page of the second document that is still unpaired, and pairing the still unpaired pages of the first document with the still unpaired pages of the second document that have identical third digests. The method further includes pairing an unpaired page in the first document that immediately follows a paired page in the first document with the page in the second document that immediately follows the other of the paired pages, if that page in the second document is also still unpaired. Finally, the method includes pairing any still unpaired pages in the first and second documents with a blank page inserted in the second and first document, respectively, and highlighting differences between paired pages that do not have identical first digests on a visual rendering of the paired pages.
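The staged pairing described above can be sketched as follows. Everything concrete here is an assumption made for illustration: the `Page` class, the use of raw `bytes` for marking operators and rendered bitmaps, SHA-256 as the digest, and the first half of the bitmap as the "subset" (the source does not say how the subset is chosen). A paired index of `None` stands for an inserted blank page.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    operators: bytes  # serialized marking operators (assumed representation)
    bitmap: bytes     # rendered bitmap as raw bytes (assumed representation)


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()  # digest algorithm is an assumption


def first_half(bitmap: bytes) -> bytes:
    return bitmap[: len(bitmap) // 2]  # hypothetical choice of bitmap subset


def pair_identical(first, second, a_to_b, b_to_a, key):
    """Pair still-unpaired pages whose digests of key(page) are identical."""
    table = {}
    for i, page in enumerate(first):
        if i not in a_to_b:
            table.setdefault(digest(key(page)), []).append(i)
    for j, page in enumerate(second):
        if j in b_to_a:
            continue
        candidates = table.get(digest(key(page)))
        if candidates:
            i = candidates.pop(0)
            a_to_b[i], b_to_a[j] = j, i


def pair_pages(first, second, subset=first_half):
    a_to_b, b_to_a = {}, {}
    # Stages 1-3: marking operators, full bitmap, then a subset of the bitmap.
    pair_identical(first, second, a_to_b, b_to_a, lambda p: p.operators)
    pair_identical(first, second, a_to_b, b_to_a, lambda p: p.bitmap)
    pair_identical(first, second, a_to_b, b_to_a, lambda p: subset(p.bitmap))
    # Stage 4: an unpaired page that follows a paired page is paired with the
    # page following its neighbor's partner, if that page is still unpaired.
    # In this sketch, pairs made here can enable further pairs down the document.
    for i in range(1, len(first)):
        if i in a_to_b or i - 1 not in a_to_b:
            continue
        j = a_to_b[i - 1] + 1
        if j < len(second) and j not in b_to_a:
            a_to_b[i], b_to_a[j] = j, i
    # Stage 5: leftovers are paired with a blank page, represented as None.
    pairs = [(i, a_to_b.get(i)) for i in range(len(first))]
    pairs += [(None, j) for j in range(len(second)) if j not in b_to_a]
    return pairs
```

Pairs whose pages do not share an identical first digest (those found in stages 2 through 5) are the ones that would then be passed on for difference highlighting.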
One or more advantages can be provided by the invention. Documents that include objects such as graphical objects can be compared. Documents can be compared based on various visually perceptible characteristics. Documents with differences in their visually perceptible appearance can be checked and their differences highlighted. Documents can be compared page by page to identify differences in such objects, which objects can include text, graphics, images, color spaces, annotations, and so on. The invention can compare documents by computing a digest for each page of a first document and a second document and comparing digests of the pages of the first and second documents to determine matches between pages of the first and second documents. The invention can apply highlighting to identify differences between pages of the documents that do not match, on a visual rendering of the pages. The visual rendering of the pages can be on a display, printer, and so forth.
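To illustrate how a region to highlight might be located on a visual rendering, the sketch below finds the bounding box of the pixels that differ between two rendered pages. The source does not specify how differences are located or drawn, so this approach, the row-of-pixels bitmap representation, and the name `diff_bounding_box` are all assumptions made for this example.

```python
def diff_bounding_box(bitmap_a, bitmap_b):
    """Return (top, left, bottom, right) of the differing region, or None.

    Both bitmaps are assumed to be equal-sized sequences of pixel rows.
    """
    # Collect the coordinates of every pixel that differs between the two pages.
    diffs = [
        (y, x)
        for y, (row_a, row_b) in enumerate(zip(bitmap_a, bitmap_b))
        for x, (pixel_a, pixel_b) in enumerate(zip(row_a, row_b))
        if pixel_a != pixel_b
    ]
    if not diffs:
        return None  # the renderings are identical; nothing to highlight
    rows = [y for y, _ in diffs]
    cols = [x for _, x in diffs]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    page_a = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    page_b = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    print(diff_bounding_box(page_a, page_b))
```

A viewer or print routine could then draw a highlight rectangle over the returned region on each of the paired pages.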

