Information retrieval using dynamic evidence combination

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

active

06269368

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates generally to the field of computer-based information retrieval, and more specifically to a system and method for improving information retrieval effectiveness by dynamically combining evidence information produced by a plurality of retrieval systems matching alternative representations of queries and documents.
Computer-based information retrieval is now an established industry serving many professional communities. Retrieval technologies used in this industry share many common features. For example, a user of these systems is typically required to either (1) state an information need, or query, in a circumscribed manner, usually by denoting the logical requirements of the query as a sequence of terms linked by various operators, or (2) write the query as free-form text, which is then parsed automatically into a sequence of words or phrases, without regard for the logical form of the query or the underlying meaning of the query. In either event, the query is represented only by the collection of words that are overtly stated in the query text (or limited stemmed forms of some words, such as plurals). The matching of documents to a query is based on the co-occurrence of these words or phrases.
A second commonality among retrieval systems is that a query representation derived from a user's query statement is automatically formed by the computer system, with limited or no interaction with the user. In most retrieval systems, once an initial query statement has been made in full, the computer system interprets the contents of the query without allowing the user to verify, clarify or expand upon query representations created by the computerized retrieval system. In the same fashion, the subsequent display of retrieved documents is largely under computer control, with little user interaction.
Further, several techniques have been developed for retrieving desired items from a collection of several items to satisfy a user's information needs as expressed through the query. However, most of these retrieval techniques fail to provide a comprehensive solution to the information retrieval problem. Although each retrieval technique provides its own independent evidence to rate a collection of retrieved items for their relevance to the user's query, no one approach has been successful in providing all the evidence. A common solution to overcome the limitations of individual search techniques has been to combine the results of a plurality of search techniques into a single set of results. This is usually done using static or fixed combination functions, such as adding the results of the different retrieval techniques. Although this may provide improvements over individual search techniques, it does not take into consideration that different queries may be best served using different combination rules.
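As an illustration of the static approach described above, the following minimal sketch adds each document's scores across matchers using one fixed rule for every query. The function name `combine_static`, the matcher names, and the scores are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def combine_static(score_lists):
    """Add each document's scores across all matchers using one fixed rule."""
    totals = defaultdict(float)
    for scores in score_lists:
        for doc_id, score in scores.items():
            totals[doc_id] += score
    # Highest combined score first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical scores from two retrieval techniques.
term_scores = {"d1": 0.75, "d2": 0.5, "d3": 0.25}
concept_scores = {"d1": 0.25, "d2": 0.75}

ranked = combine_static([term_scores, concept_scores])
print(ranked)
```

Because the rule never changes, a query that is best served by one technique alone is still scored by the plain sum of all of them.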
In view of the above, there is a need for an information retrieval technique which increases the effectiveness and preciseness of information retrieval while combining the results of multiple retrieval approaches. Further, it is desirable that the information retrieval technique capture both the preciseness and richness of meaning in queries and documents and allow for user feedback to facilitate the retrieval process.
SUMMARY OF THE INVENTION
The present invention provides a system and method for improving information retrieval effectiveness by dynamically combining evidence information produced by a plurality of retrieval systems matching alternative representations of queries and documents.
In one aspect, the present invention, in contrast to conventional systems that combine multiple evidence sources in a fixed or static manner, performs dynamic evidence combination, wherein the combination regime used to combine individual match scores based on multiple evidence sources is dynamically adjusted for different queries and document collections. In one embodiment, the dynamic modification of the combination regime is based on information such as query-dependent information, information specific to the retrieved documents, score correlation information for documents retrieved using multiple retrieval approaches, and optionally user relevance judgment information.
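One way to picture a combination regime that changes per query is a weighted sum whose weights are derived from query-dependent information. The sketch below is only an illustration under stated assumptions: the feature names (`has_proper_nouns`, `num_terms`), the matcher names, and the boosting rule are all hypothetical, and the text does not specify how the regime is actually computed.

```python
def choose_weights(query_features):
    """Pick matcher weights from query features (hypothetical rule)."""
    weights = {"term": 1.0, "proper_noun": 1.0, "concept": 1.0}
    # Assumption: proper nouns in the query make the proper-noun matcher more useful.
    if query_features.get("has_proper_nouns"):
        weights["proper_noun"] = 2.0
    # Assumption: very short queries benefit from conceptual-level matching.
    if query_features.get("num_terms", 0) <= 2:
        weights["concept"] = 2.0
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def combine_dynamic(scores_by_matcher, weights):
    """Weighted sum of per-matcher scores for each document."""
    combined = {}
    for name, scores in scores_by_matcher.items():
        for doc_id, score in scores.items():
            combined[doc_id] = combined.get(doc_id, 0.0) + weights[name] * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

scores_by_matcher = {
    "term": {"d1": 1.0, "d2": 0.0},
    "proper_noun": {"d1": 0.0, "d2": 1.0},
    "concept": {"d1": 0.5, "d2": 0.5},
}
weights = choose_weights({"has_proper_nouns": True, "num_terms": 5})
ranked = combine_dynamic(scores_by_matcher, weights)
print(weights, ranked)
```

With these inputs the proper-noun matcher receives half of the total weight, so the document it favors moves to the top of the ranking.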
According to another aspect of the present invention, the amount of correlation among the scores returned by the different matchers is determined. If two or more matchers provide strongly correlated sets of scores for the documents, these scores may in some sense be redundant, and hence may be weighted downwardly. Correlation information is helpful for predicting the optimal score combination regime for a given query.
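The idea of weighting redundant matchers downward can be sketched with a Pearson correlation over the matchers' score vectors. The discount formula used here, 1 / (1 + sum of positive correlations with the other matchers), is an assumption chosen for illustration and is not stated in the text; the function names and data are likewise hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def redundancy_weights(scores_by_matcher, doc_ids):
    """Discount each matcher by its positive correlation with the others (assumed heuristic)."""
    vectors = {
        name: [scores.get(doc_id, 0.0) for doc_id in doc_ids]
        for name, scores in scores_by_matcher.items()
    }
    weights = {}
    for name, vec in vectors.items():
        overlap = sum(
            max(0.0, pearson(vec, other))
            for other_name, other in vectors.items()
            if other_name != name
        )
        weights[name] = 1.0 / (1.0 + overlap)
    return weights

doc_ids = ["d1", "d2", "d3", "d4"]
scores_by_matcher = {
    "a": {"d1": 1.0, "d2": 2.0, "d3": 3.0, "d4": 4.0},
    "b": {"d1": 2.0, "d2": 4.0, "d3": 6.0, "d4": 8.0},  # same ordering as "a"
    "c": {"d1": 4.0, "d2": 1.0, "d3": 3.0, "d4": 2.0},  # disagrees with both
}
weights = redundancy_weights(scores_by_matcher, doc_ids)
print(weights)
```

Matchers "a" and "b" rank the documents identically, so each is discounted, while "c" contributes independent evidence and keeps its full weight.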
According to still another aspect of the present invention, techniques are provided for generating sophisticated representations of the contents of both queries and documents by using natural language processing (NLP) techniques to represent, index, and retrieve texts at the multiple levels (e.g., the morphological, lexical, syntactic, semantic, discourse, and pragmatic levels) at which humans construe meaning in writing. The invention also offers the user the ability to interact with the system to confirm and refine the system's interpretation of the query content, both at an initial query processing step and after query matching has occurred.
According to a further aspect of the invention, the user enters a query, possibly a natural language query, and the system processes the query to generate alternative representations of the query. The alternative representations may include conceptual-level abstraction and enrichment of the query, and may include other representations. In a specific embodiment, the conceptual-level representation is a subject field code vector, while the other representations include one or more of representations based on complex nominals (CNs), proper nouns (PNs), single terms, text structure, and logical make-up of the query, including mandatory terms and negations. After processing the query, the system displays query information to the user, indicating the system's interpretation and representation of the content of the query. The user is then given an opportunity to provide input, in response to which the system modifies the alternative representation of the query. Once the user has provided desired input, the possibly modified representation of the query is matched to the relevant document database, and measures of relevance generated for the documents. The documents in the database have preferably been processed to provide corresponding alternative representations for matching to queries.
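The alternative representations listed above can be pictured as a single record that the user may adjust before matching. The following sketch is a hypothetical data structure, not the patent's actual format: the class name, field names, and the `satisfies_logic` check are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class QueryRepresentation:
    """Hypothetical container for the alternative representations of one query."""
    subject_field_codes: dict = field(default_factory=dict)  # conceptual-level vector
    complex_nominals: set = field(default_factory=set)
    proper_nouns: set = field(default_factory=set)
    single_terms: set = field(default_factory=set)
    mandatory_terms: set = field(default_factory=set)  # must appear in a document
    negated_terms: set = field(default_factory=set)    # must not appear

def satisfies_logic(query, doc_terms):
    """True if the document has every mandatory term and no negated term."""
    return query.mandatory_terms <= doc_terms and not (query.negated_terms & doc_terms)

query = QueryRepresentation(
    single_terms={"trade", "tariff"},
    mandatory_terms={"tariff"},
    negated_terms={"steel"},
)
doc_terms = {"tariff", "trade", "agreement"}
before_feedback = satisfies_logic(query, doc_terms)

# The user reviews the system's interpretation and marks one more term as mandatory.
query.mandatory_terms.add("japan")
after_feedback = satisfies_logic(query, doc_terms)
print(before_feedback, after_feedback)
```

Keeping the logical requirements as separate fields lets the user's input change them without touching the other representations.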
According to another aspect of the invention, a set of retrieved documents is presented to the user, who is given an opportunity to select some or all of the documents, typically on the basis of such documents being of particular relevance. The user then initiates the generation of a query representation based on the alternative representations of the selected document(s). To the extent that the set of documents was retrieved in response to a previous query, the alternative representations of the selected documents may be combined with the alternative representation of the previous query. Thus, the user is able to improve on an initial query representation by re-expressing the query as a composite of the representations derived from documents deemed highly relevant by the user, possibly combined with the representation of the original query. Information about the selected documents may also be used to dynamically modify the evidence combination regime so as to increase the number of relevant documents retrieved upon a subsequent execution of the query.
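Re-expressing a query as a composite of the selected documents' representations and the original query can be sketched as a weighted average of sparse vectors. This is an assumption-laden illustration in the spirit of classic relevance feedback, not the patent's stated procedure: the function `refine_query`, the `query_weight` parameter, and the term vectors are hypothetical.

```python
def refine_query(query_vec, selected_doc_vecs, query_weight=0.5):
    """Blend the previous query vector with the mean of the selected document vectors."""
    if not selected_doc_vecs:
        return dict(query_vec)
    # Split the remaining weight evenly across the selected documents.
    doc_weight = (1.0 - query_weight) / len(selected_doc_vecs)
    combined = {term: query_weight * value for term, value in query_vec.items()}
    for vec in selected_doc_vecs:
        for term, value in vec.items():
            combined[term] = combined.get(term, 0.0) + doc_weight * value
    return combined

previous_query = {"law": 1.0}
selected_docs = [
    {"law": 1.0, "trade": 1.0},
    {"trade": 1.0},
]
new_query = refine_query(previous_query, selected_docs)
print(new_query)
```

Here the term "trade", absent from the original query, enters the new representation because both documents selected by the user contain it.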
According to a further aspect of the invention, texts (documents and queries) are processed to determine discourse aspects of the text beyond the subject matter of the text. This text structure includes temporal information (past, present, and future), and intention information (e.g., analysis, prediction, cause/effect). Thus the invention is able to detect the higher order abstractions that exist
