Method and system for data clustering for very large databases

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/101, 382/226, G06F 15/18

active

058321829

ABSTRACT:
Multi-dimensional data contained in very large databases is efficiently and accurately clustered to determine patterns therein and extract useful information from such patterns. Conventional computer processors may be used which have limited memory capacity and conventional operating speed, allowing massive data sets to be processed in a reasonable time and with reasonable computer resources. The clustering process is organized using a clustering feature tree structure wherein each clustering feature comprises the number of data points in the cluster, the linear sum of the data points in the cluster, and the square sum of the data points in the cluster. A dense region of data points is treated collectively as a single cluster, and points in sparsely occupied regions can be treated as outliers and removed from the clustering feature tree. The clustering can be carried out continuously with new data points being received and processed, and with the clustering feature tree being restructured as necessary to accommodate the information from the newly received data points.
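
The clustering feature described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the "square sum" is stored as a single scalar (the sum of squared norms of the points), and the names ClusteringFeature, from_point, merge, centroid, and radius are hypothetical, chosen here only for illustration. The sketch shows why the three stored quantities are enough: two features combine by simple addition, and a cluster's centroid and radius can be derived without revisiting the individual data points.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Summary of a cluster: point count, linear sum, and square sum."""
    n: int                    # number of data points in the cluster
    linear_sum: List[float]   # per-dimension sum of the points
    square_sum: float         # assumption: scalar sum of squared norms

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        # A single point is a cluster of size one.
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # Features are additive: merging two clusters adds each component.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root-mean-square distance of the points from the centroid,
        # computed as sqrt(SS/N - ||LS/N||^2); clamped against rounding.
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return sqrt(max(variance, 0.0))


# Usage: absorb a new point into an existing cluster summary.
cluster = ClusteringFeature.from_point([0.0, 0.0])
cluster = cluster.merge(ClusteringFeature.from_point([2.0, 0.0]))
print(cluster.n, cluster.centroid(), cluster.radius())
```

Because merging is plain addition, a newly received data point can be folded into an existing summary in constant time per dimension, which is consistent with the continuous processing the abstract describes.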

REFERENCES:
patent: 5040133 (1991-08-01), Feintuch et al.
patent: 5179643 (1993-01-01), Homma et al.
patent: 5263120 (1993-11-01), Bickel
patent: 5325466 (1994-06-01), Kormacker
patent: 5329596 (1994-07-01), Sakou et al.
patent: 5375175 (1994-12-01), Kino et al.
patent: 5404561 (1995-04-01), Castelaz
patent: 5423038 (1995-06-01), Davis
patent: 5424783 (1995-06-01), Wong
patent: 5440742 (1995-08-01), Schwanke
patent: 5448727 (1995-09-01), Annevelink
patent: 5555196 (1996-09-01), Asano
P. Cheeseman, et al., "AutoClass: A Bayesian Classification System," Proc. of the 5th Int'l. Conf. on Machine Learning, Morgan Kaufmann, Jun. 1988, pp. 296-306.
R. Dubes, et al., "Clustering Methodologies in Exploratory Data Analysis," Advances in Computers, vol. 19, Academic Press, New York, 1980, pp. 113-228.
M. Ester, et al., "A Database Interface for Clustering in Large Spatial Databases," Proc. of 1st Int'l. Conf. on Knowledge Discovery and Data Mining, 1995.
M. Ester, et al., "Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification," Proc. of 4th Int'l. Symposium on Large Spatial Databases, Portland, Maine, 1995, pp. 1-20.
D. Fisher, "Knowledge Acquisition Via Incremental Conceptual Clustering," Machine Learning, vol. 2, No. 2, 1987, pp. 139-172 (original publication).
D. Fisher, "Iterative Optimization and Simplification of Hierarchical Clusterings," Technical Report CS-95-01, 1995, Dept. of Computer Science, Vanderbilt University, Nashville, Tenn., pp. 1-33.
M. Lebowitz, "Experiments with Incremental Concept Formation: UNIMEM," Machine Learning, vol. 2, 1987, pp. 103-138.
R.C.T. Lee, "Clustering Analysis and Its Applications," Adv. In Info. Sys. Sci., vol. 8, 1987, pp. 169-292.
F. Murtagh, "A Survey of Recent Advances in Hierarchical Clustering Algorithms," The Computer Journal, 1983, pp. 354-359.
R. Ng, et al., "Efficient and Effective Clustering Methods for Spatial Data Mining," Proc. of 20th VLDB Conf., 1994, pp. 144-155.
C. Olson, "Parallel Algorithms for Hierarchical Clustering," Technical Report, Computer Science Division, University of California at Berkeley, Dec. 1993, pp. 1-24.
El Sherif, et al., "Pattern Recognition Using Neural Networks That Learn from Fuzzy Rules," Proc. of the 37th Midwest Symposium on Circuits and Systems, Aug. 5, 1994, pp. 599-602.
Cheng, "Fuzzy Clustering as Blurring," Proc. of the Third IEEE Conf. on Fuzzy Systems, Jun. 29, 1994, pp. 1830-1834.
Matthews, et al., "Clustering Without a Metric," IEEE Transactions on Pattern Analysis and Machine Intelligence, Feb. 1991, pp. 175-184.
Kosaka, et al., "Tree-Structured Speaker Clustering for Fast Speaker Adaptation," ICASSP-94, Apr. 22, 1994, pp. I/245-I/248.
Frigui, et al., "Competitive Fuzzy Clustering," NAFIPS, Jun. 22, 1996, pp. 225-228.
Perrone, "A Novel Recursive Partitioning Criterion," IJCNN-91, Jul. 14, 1991, p. 989.
