Patent
1996-07-09
1999-06-15
Amsbury, Wayne
Data processing: database and file management or data structures
Database design
Data structure types
707/5, G06F 17/30
active
059132086
ABSTRACT:
A computer system has a document collection of one or more documents and one or more indexes, each of which includes an inverted file with one or more terms. Each term is associated with one or more document identifiers. Each index further includes a document catalog that associates each document identifier with one or more attributes, either intrinsic or non-intrinsic. A search engine process produces a hit list having one or more hit list entries. Each hit list entry has one or more hit list attributes and is associated with one of the documents that the search engine determines to be relevant to a query. A formatter processor selects one or more of the hit list attributes, identified by a hit list attribute selector, and then compares the selected attributes of two or more entries on the hit list to determine whether the documents associated with these entries are duplicate instances of one another. The determination can be made without examining the content of the documents associated with the entries.
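The attribute comparison described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names HitListEntry and find_duplicate_groups, the attribute names (title, size, checksum), and the choice of exact equality as the comparison are all assumptions made for illustration, since the abstract does not say which attributes are selected or how they are compared.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class HitListEntry:
    """Hypothetical hit list entry: a document identifier plus catalog attributes."""
    doc_id: str
    attributes: dict


def find_duplicate_groups(hit_list, selected_attributes):
    """Group entries whose selected attributes are all equal.

    Only hit list attributes are inspected; document content is never read.
    Exact equality is an assumed comparison rule, not one stated in the source.
    """
    groups = defaultdict(list)
    for entry in hit_list:
        key = tuple(entry.attributes.get(name) for name in selected_attributes)
        # An entry missing any selected attribute is left ungrouped.
        if any(value is None for value in key):
            continue
        groups[key].append(entry.doc_id)
    # Keep only groups with two or more entries, i.e. candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Illustrative hit list; the attribute values are made up.
hits = [
    HitListEntry("d1", {"title": "Annual Report", "size": 20480, "checksum": "a1f3"}),
    HitListEntry("d2", {"title": "Annual Report", "size": 20480, "checksum": "a1f3"}),
    HitListEntry("d3", {"title": "Annual Report", "size": 512, "checksum": "77c0"}),
]

# The tuple passed here plays the role of the hit list attribute selector.
duplicate_groups = find_duplicate_groups(hits, ("size", "checksum"))
```

With the selector ("size", "checksum"), entries d1 and d2 fall into one group and d3 stays separate. Selecting only "title" would group all three, which shows how the choice of attributes controls what counts as a duplicate.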
Brown, Eric William
Prager, John Martin
Amsbury, Wayne
International Business Machines Corporation
Percello, Louis J.
Wallace, Jr. Michael J.
Identifying duplicate documents from search results without comp