Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-04-02
2004-02-03
Homere, Jean R. (Department: 2177)
C707S793000, C707S793000, C707S793000
active
06687695
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to a relational database management system, and in particular, to SQL-based analytic algorithms that provide statistical and machine learning methods to create analytic models from the data residing in a relational database.
2. Description of Related Art
Relational databases are the predominant form of database management systems used in computer systems. Relational database management systems are often used in so-called “data warehouse” applications where enormous amounts of data are stored and processed. In recent years, several trends have converged to create a new class of data warehousing applications known as data mining applications. Data mining is the process of identifying and interpreting patterns in databases, and can be generalized into three stages.
Stage one is the reporting stage, which analyzes the data to determine what happened. Generally, most data warehouse implementations start with a focused application in a specific functional area of the business. These applications usually focus on reporting historical snapshots of business information that was previously difficult or impossible to access. Examples include Sales Revenue Reporting, Production Reporting, and Inventory Reporting, to name a few.
Stage two is the analyzing stage, which analyzes the data to determine why it happened. As stage one end-users gain previously unseen views of their business, they quickly seek to understand why certain events occurred, for example, a decline in sales revenue. After discovering a reported decline in sales, data warehouse users will then obviously ask, “Why did sales go down?” Learning the answer to this question typically involves probing the database through an iterative series of ad hoc or multidimensional queries until the root cause of the condition is discovered. Examples include Sales Analysis, Inventory Analysis, or Production Analysis.
Stage three is the predicting stage, which tries to determine what will happen. As stage two users become more sophisticated, they begin to extend their analysis to include prediction of unknown events. For example, “Which end-users are likely to buy a particular product?” or “Who is at risk of leaving for the competition?” It is difficult for humans to see or interpret subtle relationships in data; hence, as data warehouse users evolve to sophisticated predictive analysis, they soon reach the limits of traditional query and reporting tools. Data mining helps end-users break through these limitations by leveraging intelligent software tools to shift some of the analysis burden from the human to the machine, enabling the discovery of relationships that were previously unknown.
Many data mining technologies are available, from single algorithm solutions to complete tool suites. Most of these technologies, however, are used in a desktop environment where little data is captured and maintained. Therefore, most data mining tools are used to analyze small data samples, which were gathered from various sources into proprietary data structures or flat files. On the other hand, organizations are beginning to amass very large databases and end-users are asking more complex questions requiring access to these large databases.
Unfortunately, most data mining technologies cannot be used with large volumes of data. Further, most analytical techniques used in data mining are algorithm-based rather than data-driven, and as such, there is currently little synergy between data mining and data warehouses. Moreover, from a usability perspective, traditional data mining techniques are too complex for use by database administrators and application programmers, and are too difficult to change for a different industry or a different customer.
Thus, there is a need in the art for data mining applications that directly operate against data warehouses, and that allow non-statisticians to benefit from advanced mathematical techniques available in a relational environment.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for performing data mining applications in a relational database management system. At least one analytic algorithm is performed by a computer directly against a relational database, wherein the analytic algorithm includes SQL statements performed by the relational database management system and optional programmatic iteration, and the analytic algorithm creates at least one analytic model within an analytic logical data model from data residing in the relational database.
An object of the present invention is to provide more efficient usage of parallel processor computer systems. Another object of the present invention is to provide a foundation for data mining tool sets in relational database management systems. Further, an object of the present invention is to allow data mining of large databases.
REFERENCES:
patent: 5412806 (1995-05-01), Du et al.
patent: 5590322 (1996-12-01), Harding et al.
patent: 5701400 (1997-12-01), Amado
patent: 5710915 (1998-01-01), McElhiney
patent: 5713014 (1998-01-01), Durflinger et al.
patent: 5734887 (1998-03-01), Kingberg et al.
patent: 5787413 (1998-07-01), Kauffman et al.
patent: 5787425 (1998-07-01), Bigus
patent: 5799310 (1998-08-01), Anderson et al.
patent: 5806066 (1998-09-01), Golshani et al.
patent: 5809238 (1998-09-01), Greenblatt et al.
patent: 5895465 (1999-04-01), Guha
patent: 6032146 (2000-02-01), Chadha et al.
patent: 6108004 (2000-08-01), Medl
patent: 6115704 (2000-09-01), Olson et al.
patent: 6301575 (2001-10-01), Chadha et al.
patent: 6477538 (2002-11-01), Yaginuma et al.
Sarawagi et al., “Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications”, Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, May 1998, pp. 343-354.*
Venkatrao et al., “SQL/CLI—A New Binding Style for SQL”, ACM SIGMOD Record, vol. 24, Issue 4, Dec. 1995, pp. 72-77.*
John, George, “Enhancements to the Data Mining Process”, a Dissertation for the degree of Doctor of Philosophy, Stanford University, USA, Mar. 1997, 194 pages.*
G. Graefe et al., “On the Efficient Gathering of Sufficient Statistics for Classification from Large SQL Databases,” Microsoft Corporation, Abstract, © 1998, 5 pages.
P.S. Bradley et al., “Scaling EM (Expectation-Maximization) Clustering to Large Databases,” Microsoft Corporation, Technical Report, Feb. 1999, 21 pages.
Anand Tej
Brye Todd Michael
Hildreth James Dean
Miller Timothy Edward
Pricer James Edward
Gates & Cooper LLP
Homere Jean R.
NCR Corporation
Pham Khanh