Method and apparatus for efficient identification of...

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate

Details

C715S252000

active

06978419

ABSTRACT:
Disclosed is a computer-assisted method for finding duplicate or near-duplicate documents or text spans within a document collection by using high-discriminability text fragments. Distinctive features of the documents or text spans are identified. For each pair of documents or text spans with at least one distinctive feature in common, the distinctive features of each document or text span are compared to determine whether the pair is duplicates or near-duplicates. An apparatus for performing this computer-assisted method is also disclosed.
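The steps summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Every concrete choice here is an assumption made for illustration: word trigrams stand in for "text fragments", a fragment counts as "distinctive" when it occurs in at most `max_df` documents, and Jaccard overlap with a fixed `threshold` stands in for the pairwise comparison. The function and parameter names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shingles(text, n=3):
    """Return the set of n-word fragments in text (lowercased)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def find_near_duplicates(docs, n=3, max_df=2, threshold=0.5):
    """docs maps a document id to its text; returns sorted (id_a, id_b) pairs."""
    frags = {doc_id: shingles(text, n) for doc_id, text in docs.items()}

    # Inverted index: fragment -> ids of the documents that contain it.
    index = defaultdict(set)
    for doc_id, fs in frags.items():
        for f in fs:
            index[f].add(doc_id)

    # Assumption: a fragment is "distinctive" if at most max_df documents contain it.
    distinctive = {
        doc_id: {f for f in fs if len(index[f]) <= max_df}
        for doc_id, fs in frags.items()
    }

    # Candidate pairs: documents sharing at least one distinctive fragment.
    candidates = set()
    for f, ids in index.items():
        if 2 <= len(ids) <= max_df:
            candidates.update(combinations(sorted(ids), 2))

    # Compare the distinctive-feature sets of each candidate pair.
    # Jaccard overlap is an assumed measure, not taken from the patent.
    result = []
    for a, b in sorted(candidates):
        union = distinctive[a] | distinctive[b]
        overlap = len(distinctive[a] & distinctive[b]) / len(union)
        if overlap >= threshold:
            result.append((a, b))
    return result
```

Because only pairs that share a distinctive fragment are ever compared, the sketch avoids scoring every possible pair in the collection, which is the efficiency idea the abstract points to.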

REFERENCES:
patent: 4807182 (1989-02-01), Queen
patent: 5258910 (1993-11-01), Kanza et al.
patent: 5544049 (1996-08-01), Henderson et al.
patent: 5634051 (1997-05-01), Thomson
patent: 5680611 (1997-10-01), Rail et al.
patent: 5771378 (1998-06-01), Holt et al.
patent: 5898836 (1999-04-01), Freivald et al.
patent: 5913208 (1999-06-01), Brown et al.
patent: 5933823 (1999-08-01), Cullen et al.
patent: 5978828 (1999-11-01), Greer et al.
patent: 5983216 (1999-11-01), Kirsch et al.
patent: 6070158 (2000-05-01), Kirsch et al.
patent: 6092065 (2000-07-01), Floratos et al.
patent: 6098034 (2000-08-01), Razin et al.
patent: 6104990 (2000-08-01), Chaney et al.
patent: 6119124 (2000-09-01), Broder et al.
patent: 6185614 (2001-02-01), Cuomo et al.
patent: 6240409 (2001-05-01), Aiken
patent: 6246977 (2001-06-01), Messerly et al.
patent: 6263348 (2001-07-01), Kathrow et al.
patent: 6295529 (2001-09-01), Corston-Oliver et al.
patent: 6353824 (2002-03-01), Boguraev et al.
patent: 6353827 (2002-03-01), Davies et al.
patent: 6356633 (2002-03-01), Armstrong
patent: 6366950 (2002-04-01), Scheussler et al.
patent: 6442606 (2002-08-01), Subbaroyan et al.
patent: 6470307 (2002-10-01), Turney
patent: 6473753 (2002-10-01), Katariya et al.
patent: 6546490 (2003-04-01), Sako et al.
patent: 6547829 (2003-04-01), Meyerzon et al.
patent: 6549897 (2003-04-01), Katariya et al.
patent: 6598054 (2003-07-01), Schuetze et al.
patent: 6615209 (2003-09-01), Gomes et al.
patent: 6628824 (2003-09-01), Belanger
patent: 6643686 (2003-11-01), Hall
patent: 6658423 (2003-12-01), Pugh et al.
patent: 6697998 (2004-02-01), Damerau et al.
patent: 6718363 (2004-04-01), Ponte
patent: 6741743 (2004-05-01), Stalcup et al.
Lee et al., Duplicate detection for symbolically compressed documents, IEEE Sep. 1999, pp. 305-308.
Doermann et al., The detection of duplicates in document image databases, IEEE 1997, pp. 314-318.
Lopresti, Models and algorithms for duplicate document detection, IEEE Sep. 1999, pp. 297-300.
Jones et al., Phrasier: a System for Interactive Document Retrieval Using Keyphrases, ACM Aug. 1999, pp. 1-8.
Data Structures and Algorithms; Aho et al.; Addison-Wesley Publishing Company; Apr. 1987; pp. 189-192.
What Can We Do with Small Corpora? Document Categorization Via Cross-Entropy; Patrick Juola; Proceedings of Workshop on Similarity and Categorization, 1997.
Shivakumar et al., Finding Near-Replicas of Documents on the Web, NEC ResearchIndex, 1998, abstract.
Bharat, A Comparison of Techniques to Find Mirrored Hosts on the WWW, NEC ResearchIndex, 1999, abstract.
Chowdhury et al., Collection Statistics for Fast Duplicate Document Detection, Google search, 1999, all.
Brenda S. Baker, On Finding Duplication and Near-Duplication in Large Software Systems, Reverse Engineering, 1995, Proceedings of 2nd Working IEEE Conference, ISBN: 0-8186-7111-4; Jul. 1995, pp. 86-95; Toronto, Ontario, Canada.
