Method and apparatus for reducing the computational requirements

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/3, 707/4, 707/5, G06F 17/30

active

059832240

ABSTRACT:
The present invention is directed to an improved data clustering method and apparatus for use in data mining operations. The present invention determines the pattern vectors of a k-d tree structure that are closest to a given prototype cluster by pruning prototypes through geometrical constraints before a k-means process is applied to the prototypes. For each sub-branch in the k-d tree, a candidate set of prototypes is formed from the parent of a child node. The minimum and maximum distances from any point in the child node to each prototype in the candidate set are determined. The smallest of the maximum distances found is compared to the minimum distance of each prototype in the candidate set. Those prototypes with a minimum distance greater than the smallest of the maximum distances are pruned, or eliminated. Pruning remote prototypes reduces the number of distance calculations for the k-means process, significantly reducing the overall computation time.
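
The pruning test described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent text: it assumes that each k-d tree node is summarized by an axis-aligned bounding box (`lo`, `hi`), that distances are Euclidean, and that prototypes are plain coordinate tuples. The function names `min_max_dist` and `prune_candidates` are hypothetical.

```python
import math


def min_max_dist(lo, hi, p):
    """Return (min, max) Euclidean distance from prototype p to the box [lo, hi].

    Assumption: the node's points all lie inside the axis-aligned box, so these
    values bound the distance from any point in the node to p.
    """
    dmin2 = 0.0
    dmax2 = 0.0
    for l, h, x in zip(lo, hi, p):
        # Nearest box coordinate: zero gap if x lies within [l, h].
        nearest = max(l - x, 0.0, x - h)
        # Farthest box coordinate is always one of the two faces.
        farthest = max(abs(x - l), abs(x - h))
        dmin2 += nearest ** 2
        dmax2 += farthest ** 2
    return math.sqrt(dmin2), math.sqrt(dmax2)


def prune_candidates(lo, hi, candidates):
    """Keep only prototypes that could be closest to some point in the box."""
    bounds = [min_max_dist(lo, hi, p) for p in candidates]
    # Smallest of the maximum distances over the candidate set.
    min_of_max = min(dmax for _, dmax in bounds)
    # A prototype whose minimum distance exceeds that value cannot be the
    # nearest prototype for any point in the node, so it is dropped.
    return [p for p, (dmin, _) in zip(candidates, bounds) if dmin <= min_of_max]


# Hypothetical 2-D node covering the unit square, with three candidates.
survivors = prune_candidates(
    (0.0, 0.0), (1.0, 1.0),
    [(0.0, 0.5), (1.2, 0.5), (10.0, 10.0)],
)
```

In this example the distant prototype `(10.0, 10.0)` is eliminated for the node, so a subsequent k-means assignment step over the node's points would compute distances to two prototypes rather than three.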

REFERENCES:
patent: 5040133 (1991-08-01), Feintuch et al.
patent: 5561722 (1996-10-01), Watari et al.
patent: 5710916 (1998-01-01), Barbara et al.
patent: 5742811 (1998-04-01), Agrawal et al.
patent: 5799301 (1998-08-01), Castelli et al.
patent: 5819258 (1998-10-01), Vaithyanathan et al.
patent: 5819266 (1998-10-01), Agrawal et al.
patent: 5832182 (1998-11-01), Zhang et al.
patent: 5835891 (1998-11-01), Stoneking
patent: 5848404 (1998-12-01), Hafner et al.
patent: 5857179 (1999-01-01), Vaithyanathan et al.
Arya, et al. "Accounting for boundary effects in nearest neighbor searching", Discrete & Computational Geometry, vol. 16, No. 2, Abstract Only, Sep. 1996.
Nene, et al., "A simple algorithm for nearest neighbor search in high dimensions", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, No. 9, Abstract Only, Sep. 1997.
High Dimensional Similarity Joins by Kyuseok Shim, R. Srikant and Rakesh Agrawal, IBM Almaden Research Center, p. 1-11, Apr. 1997.
Fast Similarity Search in the Presence of Noise, Scaling and Translation in Time-Series Databases by Rakesh Agrawal, King-Ip Lin, H. Sawhney and Kyuseok Shim, Proceedings of the 21st VLDB Conference, Zurich, Switzerland 1995, p. 1-12.
Parallel Algorithms for High-dimensional Proximity Joins by John C. Shafer and Rakesh Agrawal, Proceedings of the 23rd VLDB Conference, Athens, Greece 1997, pp. 176-185.
BIRCH: An Efficient Data Clustering Method for Very Large Databases by T. Zhang, R. Ramakrishnan and M. Livny of University of Wisconsin Computer Sciences Dept, pp. 103-114, 1996.
Large-Scale Parallel Data Clustering by Dan Judd, Philip K. McKinley and Anil K. Jain, 1996 International Conference on Pattern Recognition, Vienna, Austria, Aug. 1996, pp. 1-7.
