Decision tree data structure for use in case-based reasoning

Data processing: artificial intelligence – Knowledge processing system – Knowledge representation and reasoning technique

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C706S053000, C705S002000

Reexamination Certificate

active

06704719

ABSTRACT:

FIELD OF THE INVENTION
The invention is generally related to computers and computer software. In particular, the invention is related to case-based reasoning and decision tree data structures for use therewith.
BACKGROUND OF THE INVENTION
Case-based reasoning is but one of a number of types of computer analysis approaches for drawing conclusions from input data. Case-based reasoning typically uses a decision tree to “prune” a library of past cases, also referred to herein as a “search space.” A decision tree is created by inductive reasoning,which draws generalizations from past data and applies those generalizations to new data to draw specific conclusions about the new data. Inductive reasoning is the complement of deductive reasoning, where responses to input data are developed from known general principles.
Case-based reasoning typically relies upon nearest-neighbor matching to attempt to predict a result for an unknown case based upon the results of past cases stored in a search space or library. As an example, case-based reasoning may be used by a bank to predict the likelihood that a particular customer would default on a loan, and thus whether a loan should be approved. Cases within a search space might include information such as the anticipated monthly payment, the length of time that a customer was employed at a certain job, the customer's monthly income, etc. Also, for each case in the library, an indication of whether that customer eventually defaulted on his or her loan would also be provided for each case. Then, whenever a new customer was presented to the bank, information about that customer could be presented as an unknown case, with nearest-neighbor matching used to locate those cases in the library that most closely resembled the data associated with the new case. Then, based upon whether those nearest-neighbor cases resulted in defaults, a determination could be made as to whether a loan should be approved for the new customer.
One difficulty associated with nearest-neighbor matching in case-based reasoning is the fact that nearest-neighbor matching can be extremely computationally intensive, particularly when a large number of cases exist in a library and a large number of characteristics, or attributes, need to be analyzed for each case. For this reason, often a logical construct known as a decision tree is utilized to narrow the search space with which nearest-neighbor matching is performed during case-based analysis of an unknown case. A decision tree is typically stored in a decision tree data structure, and is essentially used to prune a search space into a smaller subset of cases most likely to be relevant to an unknown case.
A conventional decision tree typically includes a collection of decision nodes arranged into a tree data structure, thus defining a plurality of paths that each identify different subsets of the cases from a search space. At each decision node, a test question is provided that queries a particular attribute of an unknown case and selects one of a plurality of test answers based upon the result of the query. Associated with each test answer is either a reference to another “child” decision node, from which another relevant query is performed, or a “leaf” node, which identifies a subset of cases from the search space, and which represents the end, or termination point, for a particular path in the decision tree. As such, a unique path is defined in the decision tree for each unique combination of test answers to the test questions presented in the decision tree, such that a relevant subset of cases may be identified for each combination of test answers.
By “pruning” the search space in this manner with a decision tree, the most likely subset of cases in the search space are quickly identified, so nearest-neighbor matching can then be performed on a smaller number of cases. As a result, case-based analysis may be performed significantly more quickly and with generally comparable results to those generated without the use of a decision tree.
The accuracy of a case-based reasoning system that incorporates a decision tree, however, can be significantly impacted by the manner in which a decision tree partitions a search space. As a result, a significant amount of effort has been directed to the automated generation of decision trees and the arrangement of decision nodes and test queries therein to maximize the accuracy of a decision tree.
One problem associated with the use of decision trees, in particular, stems from the relatively dynamic nature of case-based reasoning analysis. In particular, a case-based reasoning system is only as good as the data provided to the system, and it is therefore desirable to update a case library relatively frequently to build a comprehensive and current library with which nearest-neighbor matching may be performed. However, given that conventional decision trees store specific identifiers to the cases that match each path in the decision tree, anytime a new case is added to a case library or search space, the decision tree used to access that library will typically need to be regenerated. Generating a decision tree is computationally expensive, however, and as such, whenever a case library is updated, case matching cannot proceed until the decision tree is modified in view of the cases in the updated case library. As a consequence, for frequently updated libraries, system availability may be adversely impacted by the need to frequently regenerate the decision trees associated with such libraries.
Therefore, a significant need exists in the art for a manner of increasing the availability of a case-based analysis system, and in particular, for a manner of reducing the need to update decision trees utilized in such systems.
SUMMARY OF THE INVENTION
The invention addresses these and other problems associated with the prior art by providing an apparatus, computer-readable medium and method for use in association with case-based reasoning and the like that utilize a novel decision tree data structure. The data structure incorporates a search criterion in association with each test answer to a test criterion defined within a decision node, for use in selecting cases from a search space that match the associated test answer to the test criterion. Rather than storing identifiers to the actual cases in a case library, or search space, within a decision tree data structure, search criteria are used to provide the mechanism by which those cases that represent the nearest-neighbors for each path of the decision tree data structure can by dynamically selected.
Among other benefits, associating search criteria with test answers within a decision tree data structure takes advantage of the fact that the partitioning of a search space on a relatively coarse level, as is done with a decision tree data structure, typically does not require complete synchronization and currency with respect to a search space. As such, the utilization of search criteria in lieu of actual case identifiers eliminates the need to regenerate a decision tree after each modification (e.g., the addition of a new case) to the search space. While it still may be desirable in some embodiments to regenerate a decision tree data structure from time to time, the need to do so is significantly reduced, thereby increasing the availability of a case-based reasoning system for analyzing unknown cases.
Consistent with one aspect of the invention, a method is provided for applying case-based reasoning on an unknown case. The method includes traversing a path among a plurality of paths defined in a decision tree data structure to identify a subset of cases from a search space suitable for performing nearest-neighbor matching on the unknown case. Each path includes a plurality of decision nodes, and each decision node includes a test criterion defining a plurality of test answers. Each test answer has associated therewith a search criterion that selects cases in the search space that match the associated test answer. In addition, traversing the path inc

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Decision tree data structure for use in case-based reasoning does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Decision tree data structure for use in case-based reasoning, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Decision tree data structure for use in case-based reasoning will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3220486

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.