Reexamination Certificate
1998-10-21
2001-07-31
Teska, Kevin J. (Department: 2123)
Data processing: structural design, modeling, simulation, and em
Modeling by mathematical expression
C703S022000, C706S021000, C707S793000, C345S440000
active
06269325
ABSTRACT:
BACKGROUND
This invention relates generally to data mining software.
Data mining software extracts knowledge that may be suggested by a set of data for various uses. For example, data mining software can be used to maximize the return on an investment made in collecting marketing data, as well as in other applications such as credit risk management, process control, medical diagnosis and so forth. Typically, data mining software uses one or more different types of modeling algorithms in concert with a set of test data to determine what types of characteristics are most useful in achieving a desired response rate or behavioral response from a targeted group of individuals represented by the data. Generally, data mining software executes complex data modeling algorithms such as linear regression, logistic regression, back propagation neural networks, Classification and Regression Trees (CART) and Chi-squared Automatic Interaction Detection (CHAID) decision trees, as well as other types of algorithms, on a set of data. The results obtained by executing these algorithms are typically conveyed to a decision maker in order to decide what type of model might be best for a particular use.
One technique used to convey this information to a decision maker is a visual representation of model performance such as a lift chart or a receiver operating characteristic curve. A lift chart measures the ability of a model to rank order scores so that higher scores exhibit more of the model's attribute or behavior. A receiver operating characteristic curve, by contrast, compares a percentage of hits to a percentage of false alarms produced by a model of behavior, thereby providing a measure of the accuracy of the model.
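The comparison of hits to false alarms described above can be sketched in a few lines. This is only an illustrative sketch, not the method of the invention: the function name `roc_points`, the sample scores and the sample labels are assumptions made for the example, and tied scores are not given any special treatment.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, hit_rate) pairs as the score threshold is lowered.

    scores: model scores, higher meaning "more likely to exhibit the behavior".
    labels: 1 if the record actually exhibited the behavior, else 0.
    Assumes at least one positive and one negative label (hypothetical helper).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank records from highest score to lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    hits = 0
    false_alarms = 0
    for _, label in ranked:
        if label:
            hits += 1          # a true responder flagged by the model
        else:
            false_alarms += 1  # a non-responder flagged by the model
        points.append((false_alarms / negatives, hits / positives))
    return points


# Made-up example data: four scored records, two of which are true hits.
example_roc = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

A curve that rises quickly toward a hit rate of 1.0 while the false alarm rate stays low indicates a more accurate model.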
In response modeling, a lift chart can be used to visually describe which prospects are more likely to respond to a particular stimulus. For example, a lift chart can be used in a marketing promotion campaign to identify likely responders versus non-responders. In such an application, the X axis of a lift chart would represent file depth, or the fraction of all prospects that can be contacted, whereas the Y axis would show the fraction of all responders that would be successfully targeted at a specified file depth.
A lift chart is typically referenced from a base line, which is a line of the form y=x that indicates the average or expected performance when no model is used. That is, for example, when 30% of all prospects are contacted, 30% of all responders are expected to be reached. When 40% are contacted, 40% of responders are expected to be reached, and so forth. An algorithm that sorts prospects by their propensity to perform in an expected behavioral manner will produce a result that can be plotted as a lift curve. A useful model will produce a lift curve that is above (i.e., exhibits lift over) the diagonal reference line.
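As a rough illustration of the lift curve and its y=x base line, the sketch below sorts prospects by score and reports, for each file depth, the fraction of all responders reached. The function name `lift_curve` and the ten sample prospects are assumptions made for the example; they do not come from the patent.

```python
def lift_curve(scores, responded, depths):
    """Return (file_depth, fraction_of_responders_reached) pairs.

    scores:    model scores, higher meaning "more likely to respond".
    responded: 1 if the prospect actually responded, else 0.
    depths:    file depths (fractions of all prospects contacted) to evaluate.
    Hypothetical helper; assumes at least one responder.
    """
    # Sort prospects from most to least likely to respond.
    ranked = [r for _, r in sorted(zip(scores, responded), key=lambda p: -p[0])]
    total_responders = sum(ranked)
    curve = []
    for depth in depths:
        contacted = round(depth * len(ranked))
        reached = sum(ranked[:contacted])
        curve.append((depth, reached / total_responders))
    return curve


# Made-up example: ten prospects already listed in descending score order.
example_scores = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
example_responded = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
example_curve = lift_curve(example_scores, example_responded, [0.3, 0.5, 1.0])

# Lift over the base line y=x: without a model, contacting a fraction
# `depth` of prospects is expected to reach the same fraction of responders.
example_lift = [(depth, reached - depth) for depth, reached in example_curve]
```

In this made-up data, contacting the top 30% of prospects reaches half of the responders, so the curve sits above the diagonal at that depth; at full file depth the curve and the base line meet.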
The lift over the diagonal reference line is the expected or predicted improvement generated by the model by intelligently targeting specific prospects based on model inputs. The model inputs will vary based upon the application to which the data mining software is being applied, as well as the nature of the algorithm type used in the model.
While the conventional lift chart is adequate to provide a visual depiction of predicted modeling behavior, the conventional lift chart may become inadequate when the data mining software executes a large number of algorithms of different types or a large number of algorithms of the same type.
SUMMARY
According to an aspect of the present invention, a method for presenting measurements of modeling performance for a plurality of modeling algorithms includes displaying on an output display device a lift chart, the lift chart having at least three lift curves, each of said three lift curves corresponding to a result obtained from executing one of the plurality of modeling executions, with each lift curve being rendered on the output device using a different visual indicia.
According to an additional aspect of the invention, a computer program product residing on a computer readable medium for displaying results of modeling of expected behavior from execution of a plurality of modeling algorithms that model the expected behavior, includes instructions for causing a computer to run the plurality of models on a set of test data to produce results of the expected behavior from the model and to produce data that can be converted into a measure of the expected behavior. The computer program also causes the computer to produce a visual representation of the results of modeling the expected behavior and render the results of modeling the expected behavior on an output device using different visual indicia to represent the different results of modeling the expected behavior.
One or more of the following advantages are provided by the aspects of the invention. Lift curves for different algorithms or algorithm types are color-coded or provided with other visual indications so that each curve is easy to identify and distinguish. The user can change which algorithms are displayed in the chart and which colors or other visual indications are assigned to each algorithm. Also, lift curves from a specific algorithm and lift curves representing different models can be superimposed in a single lift chart, with the best performing model being highlighted in a specific color. Each of the other model instances can be displayed in a different color or provided with other visual indications. In addition, the invention allows the user to view the performance of a particular algorithm while varying other factors, to understand how variance of those factors affects the performance of the algorithm. The invention permits a user to view the evolution of lift charts over time to determine if additional processing is necessary.
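One way to picture the assignment of different visual indicia to superimposed lift curves is sketched below. Everything here is an assumption made for illustration: the palette, the reserved highlight style, the helper names, and in particular the use of the area under the lift curve to decide which model is "best performing", a criterion the text above does not specify.

```python
from itertools import product

# Hypothetical palette: each (color, line style) pair is one visual indicia.
COLORS = ["blue", "green", "orange", "purple"]
LINE_STYLES = ["solid", "dashed", "dotted"]
HIGHLIGHT = ("red", "solid")  # reserved for the best performing model


def area_under(curve):
    """Trapezoidal area under a lift curve given as (depth, reached) points."""
    points = [(0.0, 0.0)] + list(curve)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def assign_indicia(curves):
    """Give every curve a distinct indicia and highlight the best one.

    curves: dict mapping a model name to its lift curve points.
    Assumption: "best" means largest area under the lift curve.
    Supports at most len(COLORS) * len(LINE_STYLES) non-highlighted curves.
    """
    best = max(curves, key=lambda name: area_under(curves[name]))
    combos = product(LINE_STYLES, COLORS)  # colors vary fastest
    styles = {}
    for name in curves:
        if name == best:
            styles[name] = HIGHLIGHT
        else:
            line_style, color = next(combos)
            styles[name] = (color, line_style)
    return best, styles


# Made-up lift curves for three algorithm types.
example_curves = {
    "logistic regression": [(0.5, 0.6), (1.0, 1.0)],
    "neural network": [(0.5, 0.8), (1.0, 1.0)],
    "CART": [(0.5, 0.7), (1.0, 1.0)],
}
example_best, example_styles = assign_indicia(example_curves)
```

The resulting mapping could then be handed to whatever charting layer draws the curves; letting the user override entries in the mapping would correspond to the user-selectable colors described above.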
REFERENCES:
patent: 5692107 (1997-11-01), Simoudis et al.
patent: 5754738 (1998-05-01), Saucedo et al.
patent: 5809499 (1998-09-01), Wong et al.
patent: 5861891 (1999-01-01), Becker
patent: 5930803 (1999-07-01), Becker et al.
patent: 5966139 (1999-10-01), Anupam et al.
patent: 5999192 (1999-12-01), Selfridge et al.
patent: 6044366 (2000-03-01), Graffe et al.
Keim et al., “Supporting Data Mining of Large Databases by Visual Feedback Queries,” IEEE, 1994, pp. 302-313.*
Keim et al., “Visualization Techniques for Mining Large Databases: A Comparison,” IEEE, 1996, pp. 923-938.*
Lee et al., “Visualization Support for Data Mining,” IEEE, 1996, pp. 69-75.*
Mohamed Hababa, “Intelligent Hybrid System for data mining,” IEEE, pp. 111.
Kennedy Ruby
Lee Yuchun
Fish & Richardson P.C.
Phan Thai
Teska Kevin J.
Unica Technologies, Inc.