Reexamination Certificate
2002-12-10
2004-03-16
Breene, John E. (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
C715S252000
active
06708165
ABSTRACT:
FIELD OF THE INVENTION
The field of the invention relates to document retrieval and, more particularly, to search engines operating within the context of a database.
BACKGROUND OF THE INVENTION
Automated methods of searching databases are generally known. For example, P. G. Ossorio developed a technique for automatically measuring the subject matter relevance of documents (Ossorio, 1964, 1966, 1968, 1969). The Ossorio technique produced a quantitative measure of the relevance of the text with regard to each of a set of distinct subject matter fields. Taken together, the numbers provided by this measure form the profile, or information spectrum, of the text. H. J. Jeffrey produced a working automatic document retrieval system using Ossorio's technique (Jeffrey, 1975, 1991). The work by Ossorio and Jeffrey showed that the technique can be used to calculate the information spectra of documents and of requests for information, and that the spectra can be effective in retrieving documents.
However, Ossorio's technique was designed to solve a particular kind of document retrieval problem (i.e., fully automatic retrieval with complete cross-indexing). As a result, the technique has certain characteristics that make it unusable for information retrieval in cases in which there is a very wide range of subject matter fields, such as the Internet.
SUMMARY
In general, in one aspect, the invention features a method for processing information. The method includes receiving a segmented judgment matrix and using the segmented judgment matrix to calculate an information spectrum. The segmented judgment matrix is a numerical matrix pairing each of a set of terms to each of a set of classifications, where each term is a word or phrase. The segmented judgment matrix includes information submatrices, with each element of each information submatrix representing a rating of a relevance of the term of the element to the classification of the element. Each information submatrix is a numerical matrix representing the relevance of each of a subset of the set of terms to each of a subset of the set of classifications.
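As an illustration of the structure just described, the following is a minimal Python sketch of a segmented judgment matrix assembled from information submatrices. It is not taken from the patent: the class name InfoSubmatrix, the function assemble, the example terms and classifications, the 0 to 6 rating scale, and the choice to fill cells outside every submatrix with 0.0 are all assumptions made for the example.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class InfoSubmatrix:
    """Hypothetical container for one information submatrix."""
    terms: list[str]               # subset of the full term set (rows)
    classifications: list[str]     # subset of the full classification set (columns)
    ratings: list[list[float]]     # ratings[i][j]: relevance of terms[i] to classifications[j]


def assemble(submatrices, fill=0.0):
    """Build a term-by-classification matrix from the submatrices.

    Cells not covered by any submatrix get `fill` (an assumption of this sketch).
    """
    terms, classes = [], []
    for sub in submatrices:
        for t in sub.terms:
            if t not in terms:
                terms.append(t)
        for c in sub.classifications:
            if c not in classes:
                classes.append(c)
    matrix = {t: {c: fill for c in classes} for t in terms}
    for sub in submatrices:
        for i, t in enumerate(sub.terms):
            for j, c in enumerate(sub.classifications):
                matrix[t][c] = float(sub.ratings[i][j])
    return terms, classes, matrix


# Made-up ratings on a hypothetical 0 (irrelevant) to 6 (highly relevant) scale.
computing = InfoSubmatrix(
    terms=["compiler", "kernel"],
    classifications=["Computer Science", "Software Engineering"],
    ratings=[[5, 4], [6, 3]],
)
life_science = InfoSubmatrix(
    terms=["enzyme", "kernel"],
    classifications=["Biology", "Botany"],
    ratings=[[6, 2], [1, 4]],
)

terms, classes, matrix = assemble([computing, life_science])
```

In this sketch each row is a term and each column a classification. The term "kernel" is rated in both submatrices, once against each subset of classifications, which is one way a term could span segments; whether the patented matrix permits that is not stated in this summary.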
In some implementations, at least some of the elements of the information submatrices represent ratings of relevance made by a human being. The segmented judgment matrix may include rows and columns, with each column of the segmented judgment matrix representing a classification and each row of the segmented judgment matrix representing a term.
The method for processing information may further include receiving a search request, using the segmented judgment matrix to calculate an information spectrum of the search request, and using the segmented judgment matrix to calculate an information spectrum for each of a plurality of documents. The calculated information spectrums then may be compared to identify at least some documents of the plurality of documents as relevant to the search request. In some implementations, each information submatrix includes a plurality of classifications and a plurality of terms relevant to each classification. In such implementations, the information spectrums are calculated based upon at least some of the plurality of terms. The plurality of terms may be selected based upon a relevance of each term of the plurality of terms to at least some of the classifications of the information submatrices.
The step of calculating an information spectrum for each document and for the search request may include determining a log average among the ratings of relevance of the terms for each classification. The information spectrums for each document may be compared by determining a distance between the information spectrum of the at least some documents and the information spectrum of the search request.
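The calculation and comparison steps can be sketched in Python as follows. This summary does not define "log average" or name the distance measure, so both are assumptions here: the sketch takes the log average of ratings r_1..r_n to be log10 of the mean of 10^r_i, and uses Euclidean distance between spectra. The judgment matrix, the classification names, the documents, and the whitespace tokenizer are likewise made up for the example.

```python
import math

# Hypothetical judgment ratings: term -> classification -> rating.
JUDGMENT = {
    "compiler": {"Computing": 6.0, "Biology": 0.0},
    "kernel": {"Computing": 5.0, "Biology": 2.0},
    "enzyme": {"Computing": 0.0, "Biology": 6.0},
}
CLASSES = ["Computing", "Biology"]


def log_average(ratings, base=10.0):
    """Assumed reading of 'log average': log of the mean of base**rating."""
    if not ratings:
        return 0.0
    return math.log(sum(base ** r for r in ratings) / len(ratings), base)


def spectrum(text):
    """Information spectrum of a text: one log-averaged rating per classification."""
    known = [w for w in text.lower().split() if w in JUDGMENT]
    return [log_average([JUDGMENT[w][c] for w in known]) for c in CLASSES]


def distance(a, b):
    """Euclidean distance between two spectra (an assumed choice of metric)."""
    return math.dist(a, b)


def rank(request, docs):
    """Order document names from nearest to farthest spectrum."""
    req = spectrum(request)
    return sorted(docs, key=lambda name: distance(req, spectrum(docs[name])))


DOCS = {
    "d1": "the compiler and the kernel",
    "d2": "enzyme kinetics",
}
ranked = rank("kernel compiler bug", DOCS)
```

Under these assumptions, a document whose known terms match those of the request has the same spectrum and a distance of zero, so it ranks first. Other averaging rules or metrics would change the ordering in less clear-cut cases.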
In some implementations, the method for processing information further includes selecting a document of the identified documents as definitely relevant to the search request. The method for processing information may use the calculated information spectrum for the selected document to form a new search request. Some implementations also may allow zooming in on a portion of a document information spectrum. The method may include determining that a document and a request have a wide spectrum with significant content in a field F of a term, and measuring the request and the document using a subengine for field F.
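One possible reading of the zooming step is sketched below in Python. The summary does not say how "significant content" is decided or what a subengine looks like, so the threshold of 4.0, the function names, the example spectra, and the toy word-counting subengine are all hypothetical stand-ins rather than the patented method.

```python
def significant_fields(spectrum, threshold):
    """Fields whose rating meets the (assumed) significance threshold."""
    return {field for field, rating in spectrum.items() if rating >= threshold}


def pick_zoom_field(request_spectrum, document_spectrum, threshold=4.0):
    """Pick a field F in which both request and document have significant content."""
    shared = (significant_fields(request_spectrum, threshold)
              & significant_fields(document_spectrum, threshold))
    if not shared:
        return None
    # Prefer the field where the weaker of the two ratings is strongest.
    return max(sorted(shared),
               key=lambda f: min(request_spectrum[f], document_spectrum[f]))


def zoom(request_text, document_text, field, subengines):
    """Re-measure both texts with the subengine registered for `field`."""
    measure = subengines[field]
    return measure(request_text), measure(document_text)


def computing_subengine(text):
    """Toy stand-in for a field-specific subengine: counts two marker words."""
    words = text.lower().split()
    return {
        "Compilers": float(words.count("compiler")),
        "Operating Systems": float(words.count("kernel")),
    }


request_spectrum = {"Computing": 5.5, "Biology": 4.2, "Law": 0.5}
document_spectrum = {"Computing": 5.0, "Biology": 1.0, "Law": 4.5}

field = pick_zoom_field(request_spectrum, document_spectrum)
request_sub, document_sub = zoom(
    "compiler bug", "kernel scheduler and compiler flags",
    field, {"Computing": computing_subengine},
)
```

Here both spectra are wide in the sense that each has more than one significant field, but only "Computing" is significant in both, so the request and the document are re-measured with the finer-grained subengine for that field.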
In another general aspect, a computer program product includes instructions operable to cause data processing apparatus to receive a segmented judgment matrix and use the segmented judgment matrix to calculate an information spectrum.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.
REFERENCES:
patent: 5781879 (1998-07-01), Arnold et al.
patent: 5787422 (1998-07-01), Tukey et al.
patent: 5835722 (1998-11-01), Bradshaw et al.
patent: 5845278 (1998-12-01), Kirsch et al.
patent: 5873056 (1999-02-01), Liddy et al.
patent: 5926812 (1999-07-01), Hilsenrath et al.
patent: 6493711 (2002-12-01), Jeffrey
James D. Johannes, Automatic Thyroid Diagnosis VIA Simulation of Physician Judgement, Aug., 1977, i-viii, 1-156.*
Chu et al., ‘Application of fuzzy multiple attribute decision making on company analysis for stock selection’, Dec., 1996, pp. 509-514.*
Mon, Don-Lin, ‘Evaluating weapon system using fuzzy analytic hierarchy process based on entropy weight’, Mar., 1995, pp. 591-598, vol. 2.*
McMeekin, G.C., ‘An estimation procedure to detect and remove unintentional judgemental bias (UJB) in the analytic hierarchy process methodology’, Oct., 1994, pp. 241-248.*
Johannes, James Donald, Automatic Thyroid Diagnosis Via Simulation of Physician Judgment, Ph.D. Dissertation, Vanderbilt University (Aug., 1977).
Kurtz, M.J., et al., Intelligent Text Retrieval in the NASA Astrophysics Data System, Center for Astrophysics Preprint Series No. 3536 (Harvard-Smithsonian Center for Astrophysics [no date]).
Ossorio, Peter G., Attribute Space Development and Evaluation, Technical Report No. RADC-TR-67-640 (Jan., 1968).
Ossorio, Peter G., Classification Space: A Multivariant Procedure for Automatic Document Indexing and Retrieval, 1 Multivariate Behavioral Research 479-524 (Oct., 1966).
Ossorio, Peter G., Classification Space Analysis, Technical Report No. RADC-TDR-64-287 (October, 1964).
Jeffrey, Joel, An Information Retrieval System Based on Semantic Measurement, Systems & Information Science Technical Report, Vanderbilt University (1975).
Jeffrey, H. Joel, Expert Document Retrieval Via Semantic Measurement, 2 Expert Systems With Applications 345-352 (1991).
Jeffrey, H. Joel, Judgment-Simulation Vector Spaces, Ch. 13, Advances in Computer Methods for Systematic Biology: Artificial Intelligence, Database, Computer Vision, ed. Renaud Fortuner (Baltimore: Johns Hopkins University Press, 1993).
Ali Mohammad
Breene John E.
H5 Technologies, Inc.