Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-05-24
2004-12-07
Kindred, Alford (Department: 2177)
C707S793000
active
06829605
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to information retrieval. In particular, the present invention relates to using logical forms in information retrieval.
Information retrieval systems have been developed to help users search through vast collections of documents to find a set of documents that are relevant to a search query. Initial information retrieval systems relied on the search query being in the form of a Boolean expression with keywords of the query linked together by Boolean operators. However, such Boolean expressions are difficult to formulate and require a level of expertise that is beyond most users.
Eventually, information retrieval systems were developed that allowed users to enter queries as natural language statements. In general, there are two types of natural language systems. The first type identifies words in the user's query and searches for these words in a word index. Documents that match these words are ranked and returned based, for example, on the frequency with which the terms appear in the documents.
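The first type of system can be illustrated with a minimal sketch. It assumes whitespace tokenization, a word index that stores per-document counts, and ranking by the summed frequency of the query words. The names `build_word_index` and `word_search` and the sample documents are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict

def build_word_index(docs):
    """Map each word to a Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word][doc_id] += 1
    return index

def word_search(index, query):
    """Return ids of documents containing any query word,
    ranked by total frequency of the query words (ties broken by id)."""
    scores = Counter()
    for word in query.lower().split():
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in
            sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

# Hypothetical three-document collection.
docs = {
    "d1": "the dog chased the cat",
    "d2": "a cat sat on the mat with another cat",
    "d3": "stock prices rose sharply",
}
index = build_word_index(docs)
ranked = word_search(index, "cat mat")
print(ranked)
```

Raw term frequency is only one possible ranking signal; the text above mentions it as an example, and a real system may weight terms differently.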
In a second type of natural language system, semantic parsers are used to identify a semantic structure of both documents and queries, known as a logical form. Logical forms are used to construct an index representing the semantic structure of sentences in the documents of the collection. Documents that match the logical form of the query are returned to the user. An example of such a system is shown in U.S. Pat. No. 5,933,822, issued to the assignee of the present application on Aug. 3, 1999, and entitled “APPARATUS AND METHODS FOR AN INFORMATION RETRIEVAL SYSTEM THAT EMPLOYS NATURAL LANGUAGE PROCESSING OF SEARCH RESULTS TO IMPROVE OVERALL PRECISION.”
The performance of information retrieval systems is assessed in terms of recall and precision. Recall measures how well the information retrieval system performs in locating all of the documents in the collection that are relevant. A system that returns all of the documents in a collection has perfect recall. Precision measures the system's ability to select only documents that are relevant. Thus, a system that returns all of the documents in a collection has poor precision because it returns a large number of documents that are irrelevant.
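The two measures can be made concrete with a short sketch that uses the standard set-based definitions: recall is the fraction of relevant documents that were retrieved, and precision is the fraction of retrieved documents that are relevant. The handling of empty sets and the ten-document collection are assumptions made for illustration.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 1.0  # assumption: with nothing to find, recall is treated as perfect
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0  # assumption: an empty result set is scored as zero precision
    return len(retrieved & relevant) / len(retrieved)

# Hypothetical collection of ten documents, two of which are relevant.
collection = [f"d{i}" for i in range(1, 11)]
relevant = {"d1", "d2"}

# Returning the whole collection: perfect recall, poor precision.
all_recall = recall(collection, relevant)
all_precision = precision(collection, relevant)

# Returning a single relevant document: perfect precision, partial recall.
one_recall = recall(["d1"], relevant)
one_precision = precision(["d1"], relevant)
print(all_recall, all_precision, one_recall, one_precision)
```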
Although retrieval systems that use logical forms generally have improved precision over keyword-based searches, there is an ongoing need for improved precision in information retrieval.
SUMMARY OF THE INVENTION
A method and apparatus are provided for improving the precision of information retrieval systems that use logical form searching techniques. Under one embodiment of the invention, several logical form triples, which represent selected portions of the logical form, are produced from the user's query and are combined together by restrictive logical operators to generate a compound logical form query. A search is then performed to find documents that meet the requirements set by the compound logical form query. In other embodiments, results generated by a logical form search are intersected with results from a word search to form a more precise set of results.
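The difference between a compound logical form query and a standard triple search can be sketched as follows. The sketch assumes that each triple is a (word, relation, word) tuple, that the index maps a triple to the set of documents containing it, and that the restrictive operator is a logical AND across all query triples. The relation labels `Dsub` and `Dobj`, the function names, and the sample data are illustrative assumptions, not details taken from the summary.

```python
def build_triple_index(doc_triples):
    """Map each logical form triple to the set of doc ids that contain it."""
    index = {}
    for doc_id, triples in doc_triples.items():
        for triple in triples:
            index.setdefault(triple, set()).add(doc_id)
    return index

def standard_triple_search(index, query_triples):
    """Documents matching at least one query triple."""
    hits = set()
    for triple in query_triples:
        hits |= index.get(triple, set())
    return hits

def compound_triple_search(index, query_triples):
    """Documents matching every query triple (assumed AND combination)."""
    sets = [index.get(triple, set()) for triple in query_triples]
    return set.intersection(*sets) if sets else set()

# Hypothetical triples extracted from three documents.
doc_triples = {
    "d1": [("chase", "Dsub", "dog"), ("chase", "Dobj", "cat")],
    "d2": [("chase", "Dsub", "dog"), ("chase", "Dobj", "car")],
    "d3": [("chase", "Dsub", "cat"), ("chase", "Dobj", "dog")],
}
index = build_triple_index(doc_triples)

# Triples for a query such as "dog chases cat".
query = [("chase", "Dsub", "dog"), ("chase", "Dobj", "cat")]
standard_hits = standard_triple_search(index, query)   # d1 and d2
compound_hits = compound_triple_search(index, query)   # d1 only

# Intersecting logical form results with (hypothetical) word search results.
word_hits = {"d1", "d3"}
intersected = standard_hits & word_hits
print(standard_hits, compound_hits, intersected)
```

In this toy data, both the compound query and the intersection with word results discard `d2`, which shares only one triple with the query.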
In further embodiments of the invention, three pairs of search results are intersected with each other to form three sets of final results. These final results are then ranked based on the techniques used to form their constituent result pairs. In one particular embodiment, results of an important word search are combined with the results of a compound logical form query to form a first set of final results. A second set of final results is formed by intersecting the important word search results with the results of a standard logical form triple search. The second set of final results is further intersected with the results of an ordinary word search to form a third set of final results. The three sets of final results are then ordered.
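The three intersections in this embodiment can be sketched with plain set operations. The summary does not say how the three sets are ordered relative to each other, so the `tier_order` parameter and its default are assumptions, as are the function name and the sample result sets. Each document is listed once, in the first tier (by the chosen order) in which it appears.

```python
def tiered_results(important_word, compound_lf, standard_lf, ordinary_word,
                   tier_order=(0, 1, 2)):
    """Form the three result sets and merge them into one ordered list."""
    first = important_word & compound_lf    # important words AND compound query
    second = important_word & standard_lf   # important words AND standard triples
    third = second & ordinary_word          # second set AND ordinary word search
    tiers = [first, second, third]

    ranked, seen = [], set()
    for i in tier_order:                    # assumed ordering of the tiers
        for doc_id in sorted(tiers[i] - seen):
            ranked.append(doc_id)
            seen.add(doc_id)
    return tiers, ranked

# Hypothetical result sets from the four underlying searches.
important_word = {"d1", "d2", "d3", "d4"}
compound_lf = {"d1"}
standard_lf = {"d1", "d2", "d3"}
ordinary_word = {"d1", "d3", "d5"}

tiers, ranked = tiered_results(important_word, compound_lf,
                               standard_lf, ordinary_word)
_, ranked_alt = tiered_results(important_word, compound_lf,
                               standard_lf, ordinary_word,
                               tier_order=(0, 2, 1))
print(tiers, ranked, ranked_alt)
```

Because the third set is a subset of the second, it only changes the merged list when it is placed ahead of the second tier, as in the `ranked_alt` call.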
REFERENCES:
patent: 5309359 (1994-05-01), Katz et al.
patent: 5438511 (1995-08-01), Maxwell et al.
patent: 5515488 (1996-05-01), Hoppe et al.
patent: 5794050 (1998-08-01), Dahlgren et al.
patent: 5873077 (1999-02-01), Kanoh et al.
patent: 5893104 (1999-04-01), Srinivasan et al.
patent: 5933822 (1999-08-01), Braden-Harder et al.
patent: 5963940 (1999-10-01), Liddy et al.
patent: 5966126 (1999-10-01), Szabo
patent: 5987457 (1999-11-01), Ballard
patent: 6076051 (2000-06-01), Messerly et al.
patent: 6161084 (2000-12-01), Messerly et al.
patent: 6205443 (2001-03-01), Evans
patent: 6246977 (2001-06-01), Messerly et al.
patent: 6263328 (2001-07-01), Coden et al.
patent: 6393428 (2002-05-01), Miller et al.
patent: 6553372 (2003-04-01), Brassell et al.
patent: 6675159 (2004-01-01), Lin et al.
Y. Alp Aslandogan et al., “Techniques and Systems for Image and Video Retrieval,” IEEE TKDE Special Issue on Multimedia Retrieval, pp. 1-19 (Jan. 1999).
M.H. Heine et al., “An Investigation of the Optimization of Search Logic for the MEDLINE Database,” Journal of the American Society for Information Science, vol. 42, No. 4, pp. 267-278 (May 1991).
D.L. Pape et al., “STATUS With IQ-Escaping From the Boolean Straitjacket,” Program, vol. 22, No. 1, pp. 32-43 (Jan. 1988).
G. Salton et al., “Automatic Query Formulations in Information Retrieval,” Journal of the American Society for Information Science, vol. 34, No. 4, pp. 262-280 (Jul. 1983).
R.S. Marcus, “Search Aids in a Retrieval Network,” Communicating Information. Proceedings of the 43rd ASIS Annual Meeting, vol. 17, pp. 394-396 (Oct. 5-10, 1980).
E.J. Guglielmo et al., “Overview of Natural Language Processing of Captions for Retrieving Multimedia Data,” Proceedings of 3rd Conference on Applied Natural Language Processing, pp. 231-232 (Mar. 31-Apr. 3, 1992).
G. Salton, “A Simple Blueprint for Automatic Boolean Query Processing,” Information Processing & Management, vol. 24, No. 3, pp. 269-280 (1988).
N. Milic-Frayling et al., “CLARIT Compound Queries and Constraint-Controlled Feedback in TREC-5 Ad-hoc Experiments,” Fifth Text Retrieval Conference (TREC-5), pp. 315-333 (1997). Publisher: Nat. Inst. Standards & Technology.
R.C. Bodner et al., “Knowledge-Based Approaches to Query Expansion in Information Retrieval,” Advances in Artificial Intelligence. 11th Biennial Conference of the Canadian Society for Computational Studies of Intelligence, AI'96, pp. 146-158 (1996).
R. Evans, “Beyond Boolean: Relevance Ranking, Natural Language and the New Search Paradigm,” Proceedings of National Online Meeting, pp. 121-128 (May 10-12, 1994).
S. Koskiala et al., “Natural Language Access to Free-Text Databases,” Information*Knowledge*Evolution, Proceedings of the 44th FID Congress, pp. 153-163 (1988).
D. Elworthy, “Question Answering Using a Large NLP System,” Proceedings of the 9th Text Retrieval Conference (TREC-9), NIST Special Publications, pp. 355-360 (Nov. 13-16, 2000).
J. Prager et al., “One Search Engine or Two for Question-Answering,” Proceedings of the 9th Text Retrieval Conference (TREC-9), NIST Special Publications, pp. 235-240 (Nov. 13-16, 2000).
E. Voorhees, “The TREC-8 Question Answering Track Report,” Proceedings of the 8th Text Retrieval Conference (TREC-8), NIST Special Publication, pp. 77-82 (Nov. 17-19, 1999).
Kindred Alford
Magee Theodore M.
Microsoft Corporation
Westman Champlin & Kelly P.A.
Wong Leslie
Method and apparatus for deriving logical relations from...