Method, apparatus and programmed medium for clustering databases

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/2, 707/3, 707/5, 707/101, 707/104, 705/27, G06F 17/30

Patent

active

060497971

ABSTRACT:
The present invention relates to a computer method, apparatus and programmed medium for clustering databases containing data with categorical attributes. The invention treats a pair of points as neighbors if their similarity exceeds a certain threshold. The similarity value for a pair of points can be based on non-metric information. The invention determines the total number of links between each cluster and every other cluster based upon the neighbors of the clusters. A goodness measure between each pair of clusters is then calculated from the total number of links between the two clusters and the total number of points within each of them. The invention merges the two clusters with the best goodness measure. Thus, clustering is performed accurately and efficiently by merging data based on the number of links between the data to be clustered.
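The steps named in the abstract (threshold-based neighbors, link counts, a goodness measure, and merging the best pair) can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions the abstract does not state: Jaccard similarity over sets of categorical values, counting the links between two points as their number of common neighbors, a points-based normalization of the goodness measure with exponent 1 + 2(1 - theta)/(1 + theta), and stopping at a target cluster count `k`. All function and variable names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed similarity: shared values divided by all values in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def point_links(points, theta):
    """Assumed link count per pair of points: number of common neighbors."""
    n = len(points)
    neighbors = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Two points are neighbors if their similarity exceeds the threshold.
        if jaccard(points[i], points[j]) > theta:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return {(i, j): len(neighbors[i] & neighbors[j])
            for i, j in combinations(range(n), 2)}


def goodness(cross_links, size_a, size_b, theta):
    """Assumed normalization: links scaled down as the clusters grow."""
    e = 1.0 + 2.0 * (1.0 - theta) / (1.0 + theta)
    return cross_links / ((size_a + size_b) ** e - size_a ** e - size_b ** e)


def link_based_clustering(points, theta, k):
    """Merge the best-scoring pair of clusters until k clusters remain."""
    links = point_links(points, theta)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best_score, best_pair = 0.0, None
        for a, b in combinations(range(len(clusters)), 2):
            # Total links between two clusters: sum over all cross pairs.
            cross = sum(links[min(p, q), max(p, q)]
                        for p in clusters[a] for q in clusters[b])
            score = goodness(cross, len(clusters[a]), len(clusters[b]), theta)
            if score > best_score:
                best_score, best_pair = score, (a, b)
        if best_pair is None:  # no linked clusters remain
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical records with categorical attributes, one set per record.
records = [
    {"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"},
    {"x", "y", "z"}, {"x", "y", "w"}, {"x", "z", "w"},
]
result = link_based_clustering(records, theta=0.4, k=2)
print(result)
```

Because merging is driven by link counts rather than by distances alone, two records can end up in the same cluster when they share many neighbors even if their direct similarity is modest. The sketch recomputes every pairwise score on each pass for readability and is only suitable for small inputs.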

REFERENCES:
patent: 5325466 (1994-06-01), Kornacker
patent: 5675791 (1997-10-01), Bhide et al.
patent: 5706503 (1998-01-01), Poppen et al.
patent: 5710915 (1998-01-01), McElhiney
patent: 5790426 (1998-08-01), Robinson
patent: 5839105 (1998-11-01), Ostendorf
patent: 5861891 (1999-01-01), Becker
patent: 5884282 (1999-03-01), Robinson
patent: 5884305 (1999-03-01), Kleinberg
patent: 5960435 (1999-09-01), Rathmann
patent: 5983224 (1999-11-01), Singh
Jeffrey Scott Vitter. Random Sampling With A Reservoir. ACM Transactions on Mathematical Software, 11(1):37-57, 1985.
Martin Ester, et al. A Density-Based Algorithm For Discovering Clusters In Large Spatial Database With Noise. In International Conference on Knowledge Discovery in Databases and Data Mining (KDD-96), Montreal, Canada, Aug. 1996.
Tian Zhang, et al. Birch: An Efficient Data Clustering Method For Very Large Databases. In Proceedings of the ACM SIGMOD Conference on Management of Data, pp. 103-114, Montreal, Canada, Jun. 1996.
Eui-Hong Han, et al. Clustering Based On Association Rule Hypergraphs. Technical report, 1997 SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Jun. 1997.
Martin Ester, et al. A Database Interface For Clustering In Large Spatial Databases. In International Conference on Knowledge Discovery in Databases and Data Mining (KDD-95), Montreal, Canada, Aug. 1995.
Raymond T. Ng, et al. Efficient And Effective Clustering Methods For Spatial Data Mining. In Proc. of the VLDB Conference, Santiago, Chile, Sep. 1994.
