Method and apparatus for incremental computation of the...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate


Details

active

07089238

ABSTRACT:
Disclosed are methods and apparatus for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. A k-nearest neighbor database includes the documents in the training set, categories, category assignments of the documents, and category scores for the documents. The method maintains a list of the nearest neighbors of the documents and their corresponding similarity scores. When documents or category assignments are added or deleted, the documents influenced by the changed documents or category assignments are identified. The category scores of the identified documents are updated to be consistent with the updated training set, and new precision and recall curves are computed for the categories that include updated category scores. The precision and recall curves may be used to determine an optimal number of documents to maximize the return of relevant documents while minimizing the total number of documents.
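
The incremental update described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `IncrementalKnn`, the cosine similarity measure, and the scoring rule (a document's category score is the sum of the similarities of its neighbors assigned to that category) are all assumptions made for illustration. Only document addition is shown; deletion would need a similar pass over the neighbor lists that contain the removed document.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class IncrementalKnn:
    """Hypothetical k-nearest-neighbor store with cached neighbor lists."""

    def __init__(self, k):
        self.k = k
        self.vectors = {}    # doc_id -> term weights
        self.labels = {}     # doc_id -> set of assigned categories
        self.neighbors = {}  # doc_id -> [(similarity, neighbor_id)], best first
        self.scores = {}     # doc_id -> {category: score}

    def _rescore(self, doc_id):
        # Assumed scoring rule: sum the similarities of the neighbors
        # that carry each category.
        totals = defaultdict(float)
        for sim, nid in self.neighbors[doc_id]:
            for cat in self.labels[nid]:
                totals[cat] += sim
        self.scores[doc_id] = dict(totals)

    def add_document(self, doc_id, vector, categories):
        """Add a document and return the ids whose category scores changed."""
        sims = [(cosine(vector, v), other) for other, v in self.vectors.items()]
        self.vectors[doc_id] = vector
        self.labels[doc_id] = set(categories)
        self.neighbors[doc_id] = sorted(sims, reverse=True)[: self.k]
        influenced = {doc_id}
        for sim, other in sims:
            current = self.neighbors[other]
            # The new document influences `other` only if it enters
            # that document's top-k neighbor list.
            if len(current) < self.k or sim > current[-1][0]:
                current.append((sim, doc_id))
                current.sort(reverse=True)
                del current[self.k:]
                influenced.add(other)
        for d in influenced:
            self._rescore(d)
        return influenced

    def precision_recall(self, category):
        """Return (precision, recall) pairs at each rank cutoff for a category."""
        ranked = sorted(
            ((s.get(category, 0.0), d) for d, s in self.scores.items()),
            reverse=True,
        )
        relevant = sum(1 for cats in self.labels.values() if category in cats)
        curve, hits = [], 0
        for rank, (_, d) in enumerate(ranked, start=1):
            hits += category in self.labels[d]
            curve.append((hits / rank, hits / relevant if relevant else 0.0))
        return curve
```

Because each document keeps its own neighbor list, adding a document rescans similarities once and rescores only the documents whose lists actually changed, rather than recomputing every category score in the training set.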

REFERENCES:
patent: 6006221 (1999-12-01), Liddy et al.
patent: 6122628 (2000-09-01), Castelli et al.
Arya, S. et al., “An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions,” Nov. 1998, Journal of the ACM, vol. 45, No. 6, pp. 891-923.
Cutting, D. R., et al., “Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections,” Jun. 1992, Ann Int'l SIGIR '92 Denmark, pp. 318-329.
Daniel P. Lopresti, “A Comparison of Text-Based Methods for Detecting Duplication in Document Image Databases,” Document Recognition and Retrieval VII (IS&T/SPIE Electronic Imaging 2000), Jan. 2000, San Jose, CA.
Narayanan Shivakumar and Hector Garcia-Molina, “The SCAM Approach to Copy Detection in Digital Libraries,” Department of Computer Science, Stanford University, Stanford, CA 94305 USA.
William B. Frakes and Ricardo Baeza-Yates, “Information Retrieval: Data Structures & Algorithms,” pp. 19-71.
Ian H. Witten et al., “Managing Gigabytes: Compressing and Indexing Documents and Images,” pp. 181-188.
Fazli Can et al., “A Dynamic Cluster Maintenance System for Information Retrieval,” SIGIR 1987, pp. 123-131.
Fazli Can et al., “Concepts of the Cover Coefficient-Based Clustering Methodology,” SIGIR 1985, pp. 204-211.

