Architecture for distributed relational data mining systems

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06687693

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to data mining systems, and in particular, to an architecture for distributed relational data mining systems.
2. Description of Related Art
Often, computer-implemented systems are used to analyze commercial and financial transaction data. In many instances, such data is analyzed to gain a better understanding of customer behavior by analysis of customer transactions.
Prior art methods for analyzing customer transactions often involve one or more of the following techniques:
1. Ad hoc querying: This methodology involves the iterative analysis of transaction data by human effort, using querying languages such as SQL.
2. On-line Analytical Processing (OLAP): This methodology involves the application of automated software front-ends that automate the querying of relational databases storing transaction data and the production of reports therefrom.
3. Statistical packages: This methodology requires the sampling of transaction data, the extraction of the data into flat file or other proprietary formats, and the application of general purpose statistical or data mining software packages to the data.
However, these prior techniques have serious shortcomings that represent significant impediments to their use and important flaws in the design of analytical architectures. Of key importance is that prior art techniques do not work well with large databases, because such schemes do not consider memory limitations and do not account for large data sets. Thus, there is a need in the art for improved techniques for implementing data mining systems, especially architectures that handle large amounts of data.
SUMMARY OF THE INVENTION
A computer-implemented data mining system includes an Interface Tier, an Analysis Tier, and a Database Tier. The Interface Tier supports interaction with users, and includes an On-Line Analytic Processing (OLAP) Client that provides a user interface for generating SQL statements that retrieve data from a database, and an Analysis Client that displays results from a data mining algorithm. The Analysis Tier performs one or more data mining algorithms, and includes an OLAP Server that schedules and prioritizes the SQL statements received from the OLAP Client, an Analytic Server that schedules and invokes the data mining algorithm to analyze the data retrieved from the database, and a Learning Engine that performs a Learning step of the data mining algorithm. The Database Tier stores and manages the databases, and includes an Inference Engine that performs an Inference step of the data mining algorithm, a relational database management system (RDBMS) that executes the SQL statements against a Data Mining View to retrieve the data from the database, and a Model Results Table that stores the results of the data mining algorithm.
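The division of work among the three tiers can be illustrated with a minimal sketch. It is not the patented implementation. It assumes an in-memory SQLite database as a stand-in for the RDBMS, a toy "model" (a mean-spend threshold) in place of a real data mining algorithm, and hypothetical names throughout (`transactions`, `data_mining_view`, `model_results`, and every function name). The OLAP Client and OLAP Server are omitted for brevity.

```python
import sqlite3


def build_database_tier():
    """Database Tier: SQLite stands in for the RDBMS (an assumption)."""
    db = sqlite3.connect(":memory:")
    # Raw transaction data (made-up sample rows for illustration only).
    db.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
    db.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(1, 10.0), (1, 30.0), (2, 200.0), (3, 15.0)],
    )
    # Hypothetical Data Mining View: one aggregated row per customer.
    db.execute(
        "CREATE VIEW data_mining_view AS "
        "SELECT customer_id, SUM(amount) AS total_spend "
        "FROM transactions GROUP BY customer_id"
    )
    # Hypothetical Model Results Table.
    db.execute("CREATE TABLE model_results (customer_id INTEGER, segment TEXT)")
    return db


def learning_step(db):
    """Analysis Tier, Learning Engine: fit the toy model outside the database."""
    rows = db.execute("SELECT total_spend FROM data_mining_view").fetchall()
    values = [row[0] for row in rows]
    return sum(values) / len(values)  # model parameter: mean spend


def inference_step(db, threshold):
    """Database Tier, Inference Engine: score every row inside the database."""
    db.execute("DELETE FROM model_results")
    db.execute(
        "INSERT INTO model_results "
        "SELECT customer_id, "
        "CASE WHEN total_spend > ? THEN 'high' ELSE 'low' END "
        "FROM data_mining_view",
        (threshold,),
    )


def analytic_server(db):
    """Analysis Tier, Analytic Server: invoke learning, then inference."""
    threshold = learning_step(db)
    inference_step(db, threshold)
    return threshold


def analysis_client(db):
    """Interface Tier, Analysis Client: fetch results for display."""
    return db.execute(
        "SELECT customer_id, segment FROM model_results ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    database = build_database_tier()
    analytic_server(database)
    for customer_id, segment in analysis_client(database):
        print(customer_id, segment)
```

In this sketch the inference step runs as a single SQL statement, so scored rows never leave the database. Only the small model parameter crosses between tiers, which is one plausible reading of why the Inference Engine is placed in the Database Tier.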


