Method and system for data mining automation in...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C706S021000

active

06636860

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The field of the invention is data processing, that is, methods and systems for financial, business practice, business management, or cost/price determinations.
2. Description of the Related Art
A data mining tool is computer software that analyzes data and discovers relationships, patterns, knowledge, or information from the data. Data mining is also referred to as knowledge discovery. Data mining tools attempt to solve the problem of users being overwhelmed by the volume of data collected by computers running business applications generally, and e-commerce applications in particular. Data mining tools attempt to shield users from the unwieldy body of data by analyzing it, summarizing it, or drawing conclusions from the data that the user can understand. For example, one known computer software data mining product is IBM's “Intelligent Miner,” which is operable in several computing environments including AIX, AS/400, OS/390, Windows NT, Windows 2000, and Solaris. The IBM Intelligent Miner is an enterprise data mining tool, designed for client/server configurations and optimized to mine very large data sets, such as gigabyte data sets. The IBM Intelligent Miner includes a plurality of data mining techniques or tools used to analyze large databases, and it provides visualization tools used to view and interpret the different mining results.
An analytic application is a software application that inputs historical data collected from a production system over time, analyzes this historical data, or samples of the historical data, and outputs the findings back to the production system to help improve its operation. For example, an e-commerce server that manages an internet shopping site is a production system, and an analytic application might use historical data collected from the e-commerce server to report on what types of users are visiting the site and how many of them are actually buying products. The term “analytic application” is used throughout this specification to mean “analytic software application,” referring to a category of software typically understood to be used directly by end users to solve practical problems in their work.
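The reporting example above can be illustrated with a minimal sketch. The record layout (`user_type`, `purchased`) and the function name `summarize_visits` are assumptions made for illustration; they do not come from the specification.

```python
from collections import Counter

# Hypothetical historical records collected from an e-commerce server.
visits = [
    {"user_type": "new", "purchased": False},
    {"user_type": "returning", "purchased": True},
    {"user_type": "new", "purchased": True},
    {"user_type": "returning", "purchased": True},
    {"user_type": "new", "purchased": False},
]


def summarize_visits(records):
    """Count visitors and buyers for each user type."""
    visitors = Counter(r["user_type"] for r in records)
    buyers = Counter(r["user_type"] for r in records if r["purchased"])
    # Counter returns 0 for a user type that never purchased.
    return {t: {"visitors": n, "buyers": buyers[t]} for t, n in visitors.items()}


report = summarize_visits(visits)
```

In a real analytic application the records would be read from the production system's data store rather than from an in-memory list.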
Data mining is an important technology to be integrated into analytic applications. Data mining is data processing technology, a combination of hardware and software, that dynamically discovers patterns in historical data records and applies properties associated with these records (e.g., likely to buy) to production data records that exhibit similar patterns. Use of data mining typically involves steps such as identifying a business problem to be solved, selecting a mining algorithm useful to solve the business problem, defining data schema to be used as inputs and outputs to and from the mining algorithm, defining data mining models based upon the defined data schema, populating input data schema with historical data, training the data mining model based upon the historical data, and scoring historical data or production data by use of the model.
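The last three steps listed above (populating the input schema, training, and scoring) can be sketched as follows. This is only an illustration under stated assumptions: the column names, the sample data, and the frequency-based "model" are hypothetical stand-ins and are not the mining algorithms of any particular data mining tool.

```python
from collections import defaultdict


def populate_schema(raw_rows, columns):
    """Copy raw rows into the input schema, keeping only the schema's columns."""
    return [{c: row[c] for c in columns} for row in raw_rows]


def train(rows, feature, label):
    """Learn, per feature value, the fraction of historical rows whose label is true."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        totals[row[feature]] += 1
        if row[label]:
            positives[row[feature]] += 1
    return {value: positives[value] / totals[value] for value in totals}


def score(model, rows, feature, default=0.0):
    """Apply the learned property to rows that exhibit a known pattern."""
    return [model.get(row[feature], default) for row in rows]


# Hypothetical historical data; "session_id" is not part of the input schema.
history = [
    {"session_id": 1, "segment": "student", "bought": True},
    {"session_id": 2, "segment": "student", "bought": False},
    {"session_id": 3, "segment": "professional", "bought": True},
    {"session_id": 4, "segment": "professional", "bought": True},
]

schema_rows = populate_schema(history, ("segment", "bought"))
model = train(schema_rows, "segment", "bought")

# Production records are scored with the trained model; unseen segments get the default.
production = [{"segment": "professional"}, {"segment": "student"}, {"segment": "retired"}]
scores = score(model, production, "segment")
```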
Analytic applications typically function in a general cycle in which historical data is collected from a production system over time, the historical data, or samples of it, are analyzed, and findings are output back to the production system to help improve its operation. The quantities of data to be analyzed are large, and the computational demand is intense. The whole cycle is often executed at regular intervals, for example, once daily at night so that reports showing the analytic findings are available for review the next morning. There is an increasing demand, however, to do the analysis faster and more frequently so that results on business performance are reported back within a few hours, in some cases within two or three hours or even less. In fact, there appears to be a trend in this area of technology toward near real-time analytic reporting.
In the prior art, however, with available data mining tools, the end user of an analytic application must be sufficiently skilled to accomplish all the tasks of data mining, some of which require substantial expertise. For applications such as e-commerce, which are being widely adopted by businesses of all sizes and in all areas of commerce, it is difficult and expensive for every business using data mining to acquire substantial data mining expertise. It would be desirable and useful, therefore, for analytic applications to automate data mining so as to reduce the need for end users to have special expertise in data mining as such.
Until recently, it was impossible to automate the data mining cycle because the steps of identifying a business problem to be solved, selecting a mining algorithm useful to solve the business problem, defining data schema to be used as inputs for mining algorithms, and defining data mining models based upon the defined data schema required substantial expertise and individual human judgment brought to bear at an end user's location on an ad hoc, case-by-case basis. Recently, however, predefined data mining models have become available founded on previously identified business questions and associated data schema.
For a discussion of predefined data mining models, see the U.S. patent application Ser. No. 09/826,662 filed on Apr. 5, 2001, which is incorporated entirely by reference into this specification.
In analytic applications operating predefined data mining models, a set of business questions that are useful to end users are predefined and the data schema needed to answer these business questions are also predefined. The predefined data mining models for use in this technology are tested and shipped with a product, an analytic application, which is then production trained and applied automatically by end users without needing specialized data mining expertise.
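One way to picture a shipped set of predefined models is as a catalog that binds each business question to its data schema and mining algorithm. The sketch below is hypothetical: the class `PredefinedModel`, the catalog keys, the column names, and the algorithm labels are illustrative assumptions, not definitions taken from this specification or from the referenced application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredefinedModel:
    """A business question bound to its input schema and mining algorithm."""
    business_question: str
    input_columns: tuple
    algorithm: str


# Hypothetical catalog shipped with an analytic application.
CATALOG = {
    "likely_buyers": PredefinedModel(
        business_question="Which visitors are likely to buy?",
        input_columns=("segment", "pages_viewed", "bought"),
        algorithm="classification",
    ),
    "customer_groups": PredefinedModel(
        business_question="What groups of similar customers exist?",
        input_columns=("segment", "total_spent"),
        algorithm="clustering",
    ),
}


def select_model(question_id):
    """The end user picks a question; the schema and algorithm are already decided."""
    return CATALOG[question_id]


chosen = select_model("likely_buyers")
```

Because the schema and algorithm are fixed in advance, the end user's only decision in this sketch is which question to ask.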
A data mining model is usually defined to address a given business question based on a given data schema. Data mining tools such as IBM's “Intelligent Miner” are generic applications that are operated independently of specific applications. Because such data mining tools in the prior art did not include set business questions, predefined data schema, or predefined data mining models, end users would themselves need to analyze business questions, define data schema useful with respect to the questions, and define their own data mining models based upon the data schema. In the prior art, developers of analytic applications incorporating data mining tools did not supply predefined data mining models. Without predefined data mining models, the data mining analytic cycle could not be automated.
Accordingly, in analytic applications using data mining tools, there is significant benefit in predefining data mining models whenever possible, as this will enable developers of analytic applications to develop analytic applications capable of automating data mining cycles so that end users may train and apply predefined data mining models with no need for specialized data mining expertise and with no need for end user intervention in data mining processes as such.
It is also true that in prior art, the often cyclic steps of populating data mining schema with historical data, training a data mining model by use of historical data, and scoring historical data or production data by use of the trained data mining model were steps requiring manual intervention. As a practical matter, manual intervention risks delays and missed schedules. There is a need in the art, therefore, for improved methods of data mining.
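The cyclic populate, train, and score steps described above could be chained so that no operator action is needed between them. The following is a minimal sketch under stated assumptions: the step functions, the sample amounts, and the schedule times are hypothetical, and a real deployment would delegate scheduling to a job scheduler rather than a simple loop.

```python
def run_cycle(load_history, train, score, production_rows):
    """Run populate -> train -> score once, with no operator input between steps."""
    schema_rows = load_history()
    model = train(schema_rows)
    return score(model, production_rows)


def run_scheduled(cycle, times):
    """Invoke the cycle once per scheduled time and collect the results."""
    return {t: cycle() for t in times}


# Hypothetical steps: the "model" is the mean historical purchase amount,
# and scoring flags production amounts above that mean.
def load_history():
    return [10, 20, 30]


def train_mean(rows):
    return sum(rows) / len(rows)


def score_above_mean(model, rows):
    return [amount > model for amount in rows]


results = run_scheduled(
    lambda: run_cycle(load_history, train_mean, score_above_mean, [5, 25]),
    times=["02:00", "05:00"],
)
```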
SUMMARY OF THE INVENTION
A principal aspect of the present invention is a method of automated data mining using a domain-specific analytic application for solving predefined business problems. Embodiments typically include populating input data schema, wherein said populating comprises reading input data from a data store and writing the input data to input data schema, the input data schema having a format appropriate to solution o
