Identifying duplicate documents from search results without comp

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/5, G06F 17/30

active

059132086

ABSTRACT:
A computer system has a document collection of one or more documents and one or more indexes that each include an inverted file with one or more terms. Each of the terms is associated with one or more document identifiers. Each index further includes a document catalog that associates each of the document identifiers with one or more attributes, either intrinsic or non-intrinsic. A search engine process produces a hit list having one or more hit list entries. Each hit list entry, with one or more hit list attributes, is associated with one of the documents that the search engine determines to be relevant to the query. A formatter processor selects one or more of the hit list attributes, as identified by a hit list attribute selector, and then compares the selected attributes of two or more entries on the hit list to determine whether the documents associated with these entries are duplicate instances of one another. The determination can be made without examining the content of the documents associated with the entries.
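The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the patented method, only one plausible reading of it: entries whose selected attributes all match are grouped as likely duplicates, and document content is never read. The names HitListEntry and find_duplicates, and the attribute names title, size, and date, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class HitListEntry:
    """One hit list entry: a document identifier plus catalog attributes (hypothetical shape)."""
    doc_id: str
    attributes: dict = field(default_factory=dict)


def find_duplicates(hit_list, selected_attributes):
    """Group entries whose selected attribute values are all equal.

    Only the attributes carried on the hit list are compared; the documents
    themselves are never opened. Entries missing any selected attribute are
    skipped, because they cannot be compared reliably.
    """
    groups = {}
    for entry in hit_list:
        if any(name not in entry.attributes for name in selected_attributes):
            continue
        # The tuple of selected values acts as the comparison key.
        key = tuple(entry.attributes[name] for name in selected_attributes)
        groups.setdefault(key, []).append(entry.doc_id)
    # Keep only keys shared by two or more entries.
    return [ids for ids in groups.values() if len(ids) > 1]


hits = [
    HitListEntry("doc1", {"title": "Annual Report", "size": 1024, "date": "1997-01-05"}),
    HitListEntry("doc2", {"title": "Annual Report", "size": 2048, "date": "1997-01-05"}),
    HitListEntry("doc3", {"title": "Annual Report", "size": 1024, "date": "1997-01-05"}),
]
print(find_duplicates(hits, ["title", "size"]))
```

In this sketch the choice of selected attributes plays the role of the hit list attribute selector: comparing on title alone would group all three entries, while adding size separates doc2 from the other two.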

REFERENCES:
patent: 4811217 (1989-03-01), Tokizane et al.
patent: 5371852 (1994-12-01), Attanasio et al.
patent: 5483650 (1996-01-01), Pedersen et al.
patent: 5524240 (1996-06-01), Barbara et al.
patent: 5550976 (1996-08-01), Henderson et al.
patent: 5608904 (1997-03-01), Chaudhuri et al.
patent: 5619692 (1997-04-01), Malkemus
patent: 5634051 (1997-05-01), Thomson
patent: 5659729 (1997-08-01), Nielsen
patent: 5701469 (1997-12-01), Brandli et al.
patent: 5704060 (1997-12-01), Del Monte
Ed Bott "Using Windows 95" pp. 26-28, 1995.
Doermann et al., "The Detection of Duplicates in Document Image Databases" IEEE Databases, 1997, pp. 314-318.
Abdelguerfi et al. "Computational Complexity of Sorting and Joining Relations with Duplicates" IEEE Transactions on Knowledge and Data Engineering, vol. 3, No. 4, Dec. 1991, pp. 496-503.
Miller "Detecting Duplicates: A Searcher's Dream Come True" Online, Jul. 1990, pp. 27-34.
"Introduction to Information Storage and Retrieval Systems", W.B. Frakes, Software Engineering Guild, Sterling, VA, 22170, pp. 1-12 (Chapter 1 of Information Retrieval Data Structures and Algorithms, Prentice Hall, Englewood Cliffs, NJ 07632).
"Introduction to Data Structures and Algorithms Related to Information Retrieval", R.A. Baeza-Yates, Universidad de Chile, Casilla 2777, Santiago, Chile, pp. 13-27 (Chapter 2 of Information Retrieval Data Structures and Algorithms, Prentice Hall, Englewood Cliffs, NJ 07632).

Profile ID: LFUS-PAI-O-410493
