Data processing: database and file management or data structures – Database and file access – Query optimization
Reexamination Certificate
2011-07-26
Channavajjala, Srirama (Department: 2157)
C707S600000, C707S698000, C707S719000, C707S747000
active
07987177
ABSTRACT:
The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.
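As an illustration of how a synopsis for a compound partition can be obtained from the synopses of its inputs alone, the following is a minimal sketch. It assumes a k-minimum-values style synopsis (the k smallest hash values of a partition) and the estimator (k - 1) / U_(k), where U_(k) is the k-th smallest hash value. The abstract does not name a specific synopsis or estimator, and the function names `h`, `kmv_synopsis`, `merge_union`, and `estimate_dv` are hypothetical. Only multiset union is shown; other multiset operations and deletions are not covered.

```python
import hashlib


def h(value):
    """Hash a value to a pseudo-uniform float in (0, 1] (assumed hash choice)."""
    digest = hashlib.sha256(repr(value).encode()).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / 2**64


def kmv_synopsis(values, k):
    """Build a synopsis: the k smallest distinct hash values, in sorted order."""
    return sorted({h(v) for v in values})[:k]


def merge_union(syn_a, syn_b, k):
    """Synopsis of the union of two partitions, using only their synopses."""
    return sorted(set(syn_a) | set(syn_b))[:k]


def estimate_dv(synopsis, k):
    """Estimate the number of distinct values from a synopsis of size at most k."""
    if len(synopsis) < k:
        # Fewer than k distinct values were seen, so the count is exact.
        return float(len(synopsis))
    # Assumed estimator: (k - 1) divided by the k-th smallest hash value.
    return (k - 1) / synopsis[k - 1]


if __name__ == "__main__":
    k = 64
    part_a = kmv_synopsis(range(0, 1000), k)      # base partition A
    part_b = kmv_synopsis(range(500, 1500), k)    # base partition B, overlaps A
    merged = merge_union(part_a, part_b, k)       # compound partition A union B
    print("estimated distinct values:", estimate_dv(merged, k))
```

In this sketch the merged synopsis is identical to the one that would be built directly from the union of the two partitions, so the base data never has to be rescanned. Each partition's synopsis can also be built independently, which is what allows parallel construction.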
REFERENCES:
patent: 5530883 (1996-06-01), Baum et al.
patent: 5542089 (1996-07-01), Lindsay et al.
patent: 5727197 (1998-03-01), Burgess et al.
patent: 5802521 (1998-09-01), Ziauddin et al.
patent: 5832475 (1998-11-01), Agrawal et al.
patent: 5950185 (1999-09-01), Alon et al.
patent: 5999928 (1999-12-01), Yan
patent: 6061676 (2000-05-01), Srivastava et al.
patent: 6226629 (2001-05-01), Cossock
patent: 6732110 (2004-05-01), Rjaibi et al.
patent: 6738762 (2004-05-01), Chen et al.
patent: 6865567 (2005-03-01), Oommen et al.
patent: 7047230 (2006-05-01), Gibbons
patent: 7124146 (2006-10-01), Rjaibi et al.
patent: 2002/0083033 (2002-06-01), Abdo et al.
patent: 2002/0198867 (2002-12-01), Lohman et al.
patent: 2003/0208488 (2003-11-01), Perrizo
patent: 2004/0049492 (2004-03-01), Gibbons
patent: 2004/0059743 (2004-03-01), Burger
patent: 2004/0133567 (2004-07-01), Witkowski et al.
patent: 2005/0097072 (2005-05-01), Brown et al.
patent: 2005/0147240 (2005-07-01), Agrawal et al.
patent: 2005/0147246 (2005-07-01), Agrawal et al.
patent: 2006/0047683 (2006-03-01), Lakshminarayan et al.
patent: 2006/0218123 (2006-09-01), Chowdhuri et al.
patent: 2008/0120274 (2008-05-01), Cruanes et al.
patent: 2010/0010989 (2010-01-01), Li et al.
patent: WO 2007/134407 (2007-11-01), None
patent: WO 2010/104902 (2010-09-01), None
Phillip B. Gibbons, "Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports", Proceedings of the 27th VLDB, 2001, 10 pages.
Kevin Beyer et al., "On Synopses for Distinct-Value Estimation Under Multiset Operations", SIGMOD'07, Jun. 12-14, 2007, pp. 199-.
Neoklis Polyzotis, "Selectivity-Based Partitioning: A Divide-and-Union Paradigm for Effective Query Optimization", CIKM'05, Oct. 31-Nov. 5, 2005.
Abdelkader Hameurlain et al., "CPU and incremental memory allocation in dynamic parallelization of SQL queries", Parallel Computing 28 (2002) 525-556.
Damianos Chatziantoniou et al., "Partitioned optimization of complex queries", Information Systems 32 (2007) 248-282.
Beyer Kevin Scott
Gemulla Rainer
Haas Peter Jay
Reinwald Berthold
Sismanis John
International Business Machines Corporation
IP Authority, LLC
Soundararajan Ramraj