Structure and method for efficient parallel high-dimensional sim

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

707 6, 707 7, 710132, 39580011, G06F 1730

Patent

active

059874686

ABSTRACT:
Multidimensional similarity join finds pairs of multi-dimensional points that are within some small distance of each other. Databases in domains such as multimedia and time-series can require a high number of dimensions. The .epsilon.-k-d-B tree has been proposed as a data structure that scales better as number of dimensions increases compared to previous data structures such as the R-tree (and variations), grid-file, and k-d-B tree. We present a cost model of the .epsilon.-k-d-B tree and use it to optimize the leaf size. This new leaf size is shown to be better in most situations compared to previous work that used a constant leaf size. We present novel parallel procedures for the .epsilon.-k-d-B tree. A load-balancing strategy based on equi-depth histograms is shown to work well for uniform or low-skew situations, whereas another based on weighted, equi-depth histograms works far better for high-skew datasets. The latter strategy is only slightly slower than the former strategy for low skew datasets. The weights for the latter strategy are based on the same cost model that is used to determine optimal leaf sizes.

REFERENCES:
patent: 4811210 (1989-03-01), McAulay
patent: 5579471 (1996-11-01), Barber et al.
patent: 5647058 (1997-07-01), Agrawal et al.
patent: 5842031 (1998-11-01), Barker et al.
Agrawal, R., et al., "Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases", Proceedings of the 21st VLDB Conference, Zurich, Switzerland, 1995, pp. 490-501.
Alsabti, K., et al., "A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data", Proceedings of the 23rd VLDB Conference, Athens, Greece, 1997, pp. 1-9.
Alsabti, K., et al., "An Efficient Parallel Algorithm for High Dimensional Similarity Join", Information Technology Group (ITG) of Hitachi America, Ltd., Oct. 8, 1997, pp. 1-27, Figs. 1-6.
Beckmann, N., et al., "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles", Proc. ACM SIGMOD, Atlantic City, New Jersey, May 1990, pp. 322-331.
Bentley, J., "Multidimensional Binary Search Trees Used for Associative Searching", Communications of the ACM, vol. 18, No. 9, Sep. 1975, pp. 509-517.
Brinkhoff, T., et al., "Efficient Processing of Spatial Join Using R-trees", Proc. 1993 ACM SIGMOD Conf. on Management of Data, 1993, pp. 237-246.
Faloutsos, C., et al., "Analysis of Object-Oriented Spatial Access Methods", ACM SIGMOD, vol. 16, No. 3, 1987, pp. 426-439.
Faloutsos, C., et al., "Fast Subsequence Matching in Time-Series Databases", Proc. Of the ACM SIGMOD Conf. on Management of Data, May 1994, pp. 419-429.
Katayama, N., et al., "The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries", Proc. of ACM SIGMOD Int'l. Conf. on Management of Data, 1997, pp. 369-380.
Li, C., et al., "HierarchyScan: A Hierarchial Similarity Search Algorithm for Databases of Long Sequences", Proc. of the 12th Int'l Conf. On Data Engineering, New Orleans, Louisana, Feb. 1966, pp. 546-553.
Li, X., et al., "On the versatility of parallel sorting by regular sampling", Parallel Computing, vol. 19, No. 10, Oct. 1993, pp. 1079-1103.
Narasimhalu, A., et al., "Multimedia Information Systems: The Unfolding of a Reality", IEEE Computer, vol. 24, No. 8, Oct. 1991, pp. 6-8.
Niblack, W., et al., "The QBIC Project: Querying Images by Content Using Color, Texture, and Shape", SPIE 1993 Int'l Symposium on Electronic Imaging: Science and Technology, 1993, pp. 173-187.
Nievergelt, J., et al., "The Grid File: An Adaptable, Symmetric Multikey File Structure", ACM Transactions on Database Systems, vol., 9, No. 1, Mar. 1984, pp. 38-71.
Robinson, J. T., "The K-D-B-Tree: A Search Structure for Large Multidimensional Dynamic Indexes", Proc. 1981 ACM SIGMOD Conf. On Management of Data, pp. 10-18.
Roussopoulos, N., et al., "Direct Spatial Search on Pictorial Databases Using Packed R-trees", ACM SIGMOD, 1985, vol. 14, No. 4, pp. 17-31.
Shafer, J., et al., "Parallel Algorithms for High-dimensional Proximity Joins", Proceedings of the 23rd VLDB Conference, Athens, Greece, 1997, pp. 176-185.
Shim, K., et al., "High-dimensional Similarity Joins", Proc. of the 13th Int'l Conference on Data Engineering, Birmingham, U.K., Apr. 1997 (11 pages).
White, D., et al., "Similarity Indexing with the SS-tree", Proc. Of the 12th Int'l Conf. On Data Engineering, Feb. 1996, pp. 516-523.
Azadegan, S., "A Parallel Join Algorithm for SIMD Architectures", Journal Of Systems And Software, vol: 39, Issue: 3, Dec. 1997, pp. 265-280.
Brinkhoff, T., "Multi-Step Processings of Spatial Joins", SIGMOD Conference 1994 pp. 197-208, May 1994.
Koudas, N., "High dimensional similarity joins: algorithms and performance evaluation", Proceedings of 14th International Conference on Data Engineering, Feb. 1998 pp. 466-475.
Merrett, T., "Distribution Models Of Relations", Fifth International Conference on Very Large Data Bases, 1979. Oct. 1979, pp. 418-425.
Rotem, D., "Spatial Join indices", Proceedings of Seventh International Conference on Data Engineering, Apr. 1991. pp. 500-509.
Theodoridis, Y., "Cost models for join queries in spatial databases", Proceedings of 14th International Conference on Data Engineering, Feb. 1998 pp. 476-483.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Structure and method for efficient parallel high-dimensional sim does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Structure and method for efficient parallel high-dimensional sim, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Structure and method for efficient parallel high-dimensional sim will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-1337580

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.