Data processing: database and file management or data structures – Database design – Data structure types
Patent
1998-04-07
2000-04-11
Black, Thomas G.
707/2, 707/3, 707/5, 707/101, 707/104, 705/27, G06F 17/30
active
060497971
ABSTRACT:
The present invention relates to a computer method, apparatus and programmed medium for clustering databases containing data with categorical attributes. The present invention assigns a pair of points to be neighbors if their similarity exceeds a certain threshold. The similarity value for pairs of points can be based on non-metric information. The present invention determines a total number of links between each cluster and every other cluster based upon the neighbors of the clusters. A goodness measure between each cluster and every other cluster is then calculated based upon the total number of links between the clusters and the total number of points within each of them. The present invention merges the two clusters with the best goodness measure. Thus, clustering is performed accurately and efficiently by merging data based on the number of links between the data to be clustered.
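The steps named in the abstract (neighbors by similarity threshold, links between clusters, a goodness measure, merging the best pair) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not specify the similarity function, the exact definition of a link, or the goodness formula, so the sketch assumes Jaccard similarity, counts a link as a neighbor shared by two points, and normalises by an expected-link term of the form used in the ROCK clustering literature. The names `jaccard`, `cluster_by_links`, `theta` and `k` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two sets of categorical items (assumed choice of measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_links(points, theta, k):
    """Greedy link-based merging; points is a list of sets, k the target count."""
    if not 0 <= theta < 1:
        raise ValueError("theta must lie in [0, 1)")
    n = len(points)
    # Two points are neighbors when their similarity exceeds the threshold.
    neighbors = [
        {j for j in range(n) if j != i and jaccard(points[i], points[j]) > theta}
        for i in range(n)
    ]
    # Assumption: a link between two points is a neighbor they have in common.
    link = {
        (i, j): len(neighbors[i] & neighbors[j])
        for i, j in combinations(range(n), 2)
    }

    def cross_links(c1, c2):
        # Total number of links between the points of two clusters.
        return sum(link[(min(p, q), max(p, q))] for p in c1 for q in c2)

    # Assumption: exponent 1 + 2*f(theta) with f(theta) = (1 - theta) / (1 + theta).
    exponent = 1 + 2 * (1 - theta) / (1 + theta)

    def goodness(c1, c2):
        # Links between the clusters, normalised by a size-dependent term.
        n1, n2 = len(c1), len(c2)
        expected = (n1 + n2) ** exponent - n1 ** exponent - n2 ** exponent
        return cross_links(c1, c2) / expected

    clusters = [frozenset([i]) for i in range(n)]
    while len(clusters) > k:
        best = max(combinations(clusters, 2), key=lambda pair: goodness(*pair))
        if cross_links(*best) == 0:
            break  # no remaining pair of clusters shares any link
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters


# Two groups of records that share items within a group but not across groups.
demo_points = [
    {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4},
    {7, 8, 9}, {7, 8, 10}, {7, 9, 10}, {8, 9, 10},
]
demo_clusters = cluster_by_links(demo_points, theta=0.4, k=2)
```

With these demo records, the first four points end up in one cluster and the last four in the other, because no link connects the two groups. The sketch recomputes every pairwise goodness value on each pass for readability; a practical implementation would maintain the values incrementally.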
REFERENCES:
patent: 5325466 (1994-06-01), Kornacker
patent: 5675791 (1997-10-01), Bhide et al.
patent: 5706503 (1998-01-01), Poppen et al.
patent: 5710915 (1998-01-01), McElhiney
patent: 5790426 (1998-08-01), Robinson
patent: 5839105 (1998-11-01), Ostendorf
patent: 5861891 (1999-01-01), Becker
patent: 5884282 (1999-03-01), Robinson
patent: 5884305 (1999-03-01), Kleinberg
patent: 5960435 (1999-09-01), Rathmann
patent: 5983224 (1999-11-01), Singh
Jeffrey Scott Vitter. Random Sampling With A Reservoir. ACM Transactions on Mathematical Software, 11(1):37-57, 1985.
Martin Ester, et al. A Density-Based Algorithm For Discovering Clusters In Large Spatial Database With Noise. In International Conference on Knowledge Discovery in Databases and Data Mining (KDD-96), Montreal, Canada, Aug. 1996.
Tian Zhang, et al. Birch: An Efficient Data Clustering Method For Very Large Databases. In Proceedings of the ACM SIGMOD Conference on Management of Data, pp. 103-114, Montreal, Canada, Jun. 1996.
Eui-Hong Han, et al. Clustering Based On Association Rule Hypergraphs. Technical report, 1997 SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Jun. 1997.
Martin Ester, et al. A Database Interface For Clustering In Large Spatial Databases. In International Conference on Knowledge Discovery in Databases and Data Mining (KDD-95), Montreal, Canada, Aug. 1995.
Raymond T. Ng, et al. Efficient And Effective Clustering Methods For Spatial Data Mining. In Proc. of the VLDB Conference, Santiago, Chile, Sep. 1994.
Guha Sudipto
Rastogi Rajeev
Shim Kyuseok
Black Thomas G.
Lucent Technologies - Inc.
Mizrahi Diane D.