Server side sampling of databases

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C709S203000

active

06988108

ABSTRACT:
A system and method are disclosed for use with a data mining application operating on a database that holds a large number of records. A selection attribute is chosen from the plurality of attributes contained in the records. The records are scanned, and a randomizing function is applied to the selection attribute of each record to produce a randomized record value. A selection criterion is then applied: each record's randomized record value is compared with the criterion, and the records that satisfy it form a subset smaller than the original data set. This subset approximates the entire database but occupies less memory and can be scanned or evaluated much more quickly.
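
The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which randomizing function or selection criterion is used, so everything specific here is an assumption made for illustration: a SHA-256 hash reduced modulo a bucket count stands in for the randomizing function, a "value below threshold" test stands in for the selection criterion, and the names `randomized_value`, `sample_records`, `buckets`, and `customer_id` are hypothetical.

```python
import hashlib


def randomized_value(attribute_value, buckets=1000):
    """Map a selection-attribute value to a pseudo-random integer in [0, buckets).

    Assumption: a cryptographic hash stands in for the randomizing function.
    """
    digest = hashlib.sha256(str(attribute_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def sample_records(records, selection_attribute, fraction, buckets=1000):
    """Scan the records and keep those whose randomized value meets the criterion.

    Assumption: the selection criterion is "randomized value < threshold",
    where the threshold is chosen so that roughly `fraction` of records pass.
    """
    threshold = round(fraction * buckets)
    return [
        record
        for record in records
        if randomized_value(record[selection_attribute], buckets) < threshold
    ]


# Hypothetical data set: 100,000 records with a numeric key and one other attribute.
records = [{"customer_id": i, "amount": i % 97} for i in range(100_000)]

# Keep roughly 5% of the records, selected by hashing the "customer_id" attribute.
subset = sample_records(records, "customer_id", 0.05)
print(len(subset), "of", len(records), "records selected")
```

Because the randomized value depends only on the selection attribute, the same records are chosen on every scan, which makes the subset repeatable without storing a list of sampled row identifiers.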

REFERENCES:
patent: 5727199 (1998-03-01), Chen et al.
patent: 5960431 (1999-09-01), Choy
patent: 6012058 (2000-01-01), Fayyad et al.
patent: 6026398 (2000-02-01), Brown et al.
patent: 6049797 (2000-04-01), Guha et al.
patent: 6236985 (2001-05-01), Aggarwal et al.
patent: 6289354 (2001-09-01), Aggarwal et al.
patent: 6347310 (2002-02-01), Passera
patent: 6449612 (2002-09-01), Bradley et al.
patent: 6510427 (2003-01-01), Bossemeyer, Jr. et al.
patent: 6519604 (2003-02-01), Acharya et al.
patent: 6532458 (2003-03-01), Chaudhuri et al.
patent: 6633882 (2003-10-01), Fayyad et al.
patent: 6772166 (2004-08-01), Hildreth
patent: 6785684 (2004-08-01), Adbo
patent: 6889221 (2005-05-01), Luo et al.
patent: 2002/0073138 (2002-06-01), Gilbert et al.
patent: 2002/0152208 (2002-10-01), Bloedorn
patent: 2002/0198863 (2002-12-01), Anjur et al.
Motwani et al., Randomized Algorithms, Mar. 1996, ACM, pp. 33-37.
Waas et al., Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer, Mar. 2000, ACM, pp. 499-509.
Karloff et al., Randomized Algorithms and Pseudorandom Numbers, Jul. 1993, ACM, pp. 454-476.
Ordonez, SEQLEM: Fast Clustering in SQL using the EM Algorithm, 2000, ACM, pp. 559-570.
Snyder, Using Transact-SQL and Simulation Techniques to Create Virtual M&M's, 2002, ACM, pp. 153-164.
