Programmed medium for clustering large databases

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/100, G06F 17/30

active

060920726

ABSTRACT:
The present invention relates to a computer method, apparatus and programmed medium for clustering large databases. The present invention represents each cluster to be merged by a constant number of well scattered points that capture the shape and extent of the cluster. The chosen scattered points are shrunk towards the mean of the cluster by a shrinking fraction to form a representative set of data points that efficiently represent the cluster. The clusters with the closest pair of representative points are merged to form a new cluster. The use of an efficient representation of the clusters allows the present invention to obtain improved clustering while efficiently eliminating outliers.
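
The steps named in the abstract (pick well scattered points, shrink them towards the cluster mean, merge the clusters whose representative points are closest) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the farthest-point selection rule, the brute-force pair search, and the default values of c (number of representative points) and alpha (shrinking fraction) are all assumptions made for illustration, and outlier elimination is left out.

```python
import math
from itertools import combinations


def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def scattered_points(points, c):
    """Pick up to c well scattered points (assumed rule: farthest-point selection)."""
    candidates = list(dict.fromkeys(points))  # distinct points, order preserved
    centre = mean(points)
    chosen = []
    for _ in range(min(c, len(candidates))):
        remaining = [p for p in candidates if p not in chosen]
        if not chosen:
            # First pick: the point farthest from the cluster mean.
            best = max(remaining, key=lambda p: math.dist(p, centre))
        else:
            # Later picks: the point farthest from everything chosen so far.
            best = max(remaining, key=lambda p: min(math.dist(p, q) for q in chosen))
        chosen.append(best)
    return chosen


def representatives(points, c, alpha):
    """Shrink the scattered points towards the mean by the fraction alpha."""
    centre = mean(points)
    return [
        tuple(x + alpha * (m - x) for x, m in zip(p, centre))
        for p in scattered_points(points, c)
    ]


def cluster_distance(reps_a, reps_b):
    """Distance between clusters: closest pair of representative points."""
    return min(math.dist(p, q) for p in reps_a for q in reps_b)


def cluster_sketch(points, k, c=4, alpha=0.3):
    """Merge clusters agglomeratively until k remain (brute force, no outlier handling)."""
    clusters = [[tuple(p)] for p in points]
    reps = [representatives(cl, c, alpha) for cl in clusters]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(reps[ij[0]], reps[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        # j > i, so deleting index j leaves index i valid.
        del clusters[j]
        del reps[j]
        clusters[i] = merged
        reps[i] = representatives(merged, c, alpha)
    return clusters
```

Calling cluster_sketch(points, 2) on two well separated groups of points returns one list per group. The pair search above rescans every pair of clusters on each merge, so it is only suitable for small inputs; it is meant to show the representation, not to be efficient on a large database.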

REFERENCES:
patent: 4945549 (1990-07-01), Simon et al.
patent: 5040133 (1991-08-01), Feintuch et al.
patent: 5263120 (1993-11-01), Bickel
patent: 5325466 (1994-06-01), Kornacker
patent: 5452371 (1995-09-01), Bozinovic et al.
patent: 5555196 (1996-09-01), Asano
patent: 5675791 (1997-10-01), Bhide et al.
patent: 5696877 (1997-12-01), Iso
patent: 5706503 (1998-01-01), Poppen et al.
patent: 5710915 (1998-01-01), McElhiney
patent: 5784283 (1998-06-01), Pingali et al.
patent: 5796924 (1998-08-01), Errico et al.
patent: 5832182 (1998-11-01), Zhang et al.
patent: 5940832 (1999-08-01), Hamada et al.
patent: 5983224 (1999-11-01), Singh et al.
patent: 6012058 (2000-01-01), Fayyad et al.
Tian Zhang, et al. Birch: An Efficient Data Clustering Method For Very Large Databases. In Proceedings of the ACM SIGMOD Conference on Management of Data, pp. 103-114, Montreal, Canada, Jun. 1996.
Eui-Hong Han, et al. Clustering Based On Association Rule Hypergraphs. Technical report, 1997 SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Jun. 1997.
Martin Ester, et al. A Database Interface For Clustering In Large Spatial Databases. In International Conference on Knowledge Discovery in Databases and Data Mining (KDD-95), Montreal, Canada, Aug. 1995.
Raymond T. Ng, et al. Efficient And Effective Clustering Methods For Spatial Data Mining. In Proc. of the VLDB Conference, Santiago, Chile, Sep. 1994.
Jeffrey Scott Vitter. Random Sampling With A Reservoir. ACM Transactions on Mathematical Software, 11(1):37-57, 1985.
Martin Ester, et al. A Density-Based Algorithm For Discovering Clusters In Large Spatial Database With Noise. In International Conference on Knowledge Discovery in Databases and Data Mining (KDD-96), Montreal, Canada, Aug. 1996.

Profile ID: LFUS-PAI-O-2047850
