Single pass space efficent system and method for generating appr

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

707 6, 707 7, 707 2, G06F 1730

Patent

active

061086584

ABSTRACT:
A system and method for finding an .epsilon.-approximate .phi.-quantile data element of a data set with N data elements in a single pass over the data set. The .epsilon.-approximate .phi.-quantile data element is guaranteed to lie within a user-specified approximation error .epsilon. of a true .phi.-quantile data element being sought. B buffers, each having a capacity of k elements, initially are filled with sorted data elements from the data set, with the values of b and k depending on .epsilon. and N. The buffers are then collapsed into an output buffer, with the remaining buffers then being refilled with data elements, collapsed (along with the previous output buffer), and so on until the entire data set has been processed and a single output buffer remains. A data element of the output buffer corresponding to the .epsilon.-approximate .phi.-quantile is then output as the approximate .phi.-quantile data element. If desired, the system and method can be practiced with sampling to even further reduce the amount of space required to find a desired .epsilon.-approximate .phi.-quantile data element.

REFERENCES:
patent: 4379948 (1983-04-01), Ney et al.
patent: 4530076 (1985-07-01), Dwyer
patent: 4817158 (1989-03-01), Picheny
patent: 4829427 (1989-05-01), Green
patent: 5018088 (1991-05-01), Higbie
patent: 5091967 (1992-02-01), Ohsawa
patent: 5105469 (1992-04-01), MacDonald et al.
patent: 5345585 (1994-09-01), Iyer et al.
patent: 5379419 (1995-01-01), Heffernan et al.
patent: 5664171 (1997-09-01), Agrawal et al.
patent: 5864841 (1999-01-01), Agrawal et al.
Sridhar Rajagopalan et al, "Approximate Order Statistics in One Pass and Little Memory: Theory and Database Applications", IBM Almaden Research Center, Nov. 3, 1997, pp. 1-18.
Viswanath Poosala et al, "Improved Histograms for Selectivity Estimation of Range Predicates", SIGMOD, Jun. 1996, pp. 294-305.
Raj Jain & Imrich Chlamtac, "The P.sup.2 Algorithm for Dynamic Calculation of Quantiles and Histograms Without Storing Observations", Communications of the ACM, vol. 28, No. 10, Oct. 1995, pp. 1076-1085.
Khaled Alsabti et al, "A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data", Proceedings of the 23rd VLDB Conference, Athens, Greece, 1997, pp. 346-355.
Richard J. Lipton et al, "Efficient sampling strategies for relational database operations", Elsevier Science Publishers A.V., 1993, pp. 195-226.
Rakesh Agrawal & Arun Swami, "A One-Pass Space-Efficient Algorithm for Finding Quantiles", 7th Int'l Conf. Management of Data (COMAD -95), Prune, India 1995, pp. 1-12.
L.G. Valiant, "A Theory of the Learnable", Communications of the ACM, vol. 27, No. 11, Nov. 1984, pp. 1134-1142.
Ira Pohl, "A Minimum Storage Algorithm for Computing The Median", IBM Thomas J. Watson Research Center, Nov. 17, 1969, pp. 1-6.
Dorit Dor & Uri Zwick, "Selecting the Median", Proceedings of the Sixth Annual ACM-SIAM Symposium On Discrete Algorithms, San Francisco, California 1995, Chapter 4, pp. 28-37.
David J. DeWitt et al, "Parallel Soring on a Shared-Nothing Architecture using Probabilistic Splitting", ACM SIGMOD & SIGARCH, IEEE Computer Society, Dec. 1991, pp. 280-291.
P. Griffiths Selinger et al, "Access Path Selection In a Relational Database Management System", ACM-SIGMOD 1979, pp. 23-34.
Mike Paterson, "Progress in Selection", Dept. of Computer Science.
William Feller, "An Introduction to Probability Theory and Its Applications", Chapter IX, vol. 1, 3rd Ed., pp. 212-242.
R. Agrawal, et al., "Mining Association Rules Between Sets of Items in Large Databases", Proceedings of the ACM SIGMOD Conference on Management of Data, pp. 207-216, Washington, D.C., May 1993.
Aditya P. Gurajada et al, "Equidepth Partitioning of a Data Set based on Finding its Medians", IEEE, Feb. 4, 1991, pp. 92-101.
Bruce W. Schmeiser & Stuart Jay Deutsch, "Quantile Estimation From Grouped Data: The Cell Midpoint", Commun. Statist. Simula. Computa, B6(3), 1997, pp. 221-234.
Bruce W. Schmeiser, "The Generation of Order Statistics In Digistal Computer Simulation: A Survey", pp. 137-140.
M. Muralikrishna & David J. DeWitt, "Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries", ACM, 1988, pp. 28-36.
Gregory Piatetsky-Shapiro et al, "Accurate Estimation of the Number of Tuples Satisfying A Condition", ACM, 1984, pp. 256-276.
J.J. Munro et al, "Selection and Sorting With Limited Storage", Theoretical Computer Science 12, 1980, pp. 315-323.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Single pass space efficent system and method for generating appr does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Single pass space efficent system and method for generating appr, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Single pass space efficent system and method for generating appr will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-593162

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.