Reexamination Certificate
2006-01-17
2006-01-17
Wong, Leslie (Department: 2167)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C709S203000
active
06988108
ABSTRACT:
A system and method for use with a data mining application for a large database having a large number of records. A selection attribute is chosen from among a plurality of attributes contained in records within the database. Records in the database are scanned, and a randomizing function is applied to the selection attribute of each record to create a randomized record value. A selection criterion is then applied to identify records for inclusion within a subset of records (smaller than the original data set) by comparing the randomized record value of each record with the selection criterion. The subset of records having a randomized record value satisfying the selection criterion approximates the entire database but takes up less memory and can be evaluated or scanned much more quickly.
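The sampling scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name a randomizing function or a selection criterion, so the SHA-256 hash, the bucket count, the "bucket below threshold" test, and the names `randomized_value`, `sample_records`, and `customer_id` are all assumptions made for illustration.

```python
import hashlib


def randomized_value(value, buckets=1000):
    """Map an attribute value to a pseudo-random bucket in [0, buckets).

    Assumption: SHA-256 stands in for the abstract's "randomizing function".
    The same input always maps to the same bucket.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def sample_records(records, selection_attribute, fraction, buckets=1000):
    """Yield the records whose randomized value satisfies the criterion.

    Assumption: the "selection criterion" is modeled as
    randomized value < fraction * buckets.
    """
    threshold = int(fraction * buckets)
    for record in records:  # single scan over the data
        if randomized_value(record[selection_attribute], buckets) < threshold:
            yield record


# Hypothetical data: 10,000 records keyed by a made-up customer_id attribute.
records = [{"customer_id": i, "amount": i % 7} for i in range(10000)]
subset = list(sample_records(records, "customer_id", 0.1))
```

Because the decision depends only on the value of the selection attribute, repeating the scan selects the same records, and all records that share a value of that attribute are included or excluded together.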
REFERENCES:
patent: 5727199 (1998-03-01), Chen et al.
patent: 5960431 (1999-09-01), Choy
patent: 6012058 (2000-01-01), Fayyad et al.
patent: 6026398 (2000-02-01), Brown et al.
patent: 6049797 (2000-04-01), Guha et al.
patent: 6236985 (2001-05-01), Aggarwal et al.
patent: 6289354 (2001-09-01), Aggarwal et al.
patent: 6347310 (2002-02-01), Passera
patent: 6449612 (2002-09-01), Bradley et al.
patent: 6510427 (2003-01-01), Bossemeyer, Jr. et al.
patent: 6519604 (2003-02-01), Acharya et al.
patent: 6532458 (2003-03-01), Chaudhuri et al.
patent: 6633882 (2003-10-01), Fayyad et al.
patent: 6772166 (2004-08-01), Hildreth
patent: 6785684 (2004-08-01), Adbo
patent: 6889221 (2005-05-01), Luo et al.
patent: 2002/0073138 (2002-06-01), Gilbert et al.
patent: 2002/0152208 (2002-10-01), Bloedorn
patent: 2002/0198863 (2002-12-01), Anjur et al.
Motwani et al., Randomized Algorithms, Mar. 1996, ACM, pp. 33-37.
Waas et al., Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer, Mar. 2000, ACM, pp. 499-509.
Karloff et al., Randomized Algorithms and Pseudorandom Numbers, Jul. 1993, ACM, pp. 454-476.
Ordonez, SQLEM: Fast Clustering in SQL using the EM Algorithm, 2000, ACM, pp. 559-570.
Snyder, Using Transact-SQL and Simulation Techniques to Create Virtual M&M's, 2002, ACM, pp. 153-164.
Bernhardt Jeffrey R.
Vinarsky Ilya
Microsoft Corporation
Server side sampling of databases