Maintaining a random sample of a relation in a database in the p

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

707 3, 707 4, 707101, 707102, G06F 1730

Patent

active

060120649

ABSTRACT:
Techniques for maintaining a random sample of a relation in a database in the presence of updates to the relation. The random sample of the relation is referred to as a "backing sample," and it is maintained in the presence of insert, modify and delete operations involving the relation. When a new tuple is inserted into the relation, a sample of the given tuple is added to the backing sample if the size of the backing sample is below an upper bound. Otherwise, a randomly-selected tuple of the backing sample is replaced with the new tuple if a sample of the new tuple must be inserted into the backing sample to maintain randomness or another characteristic. When a tuple in the relation is the subject of a modify operation, the backing sample is left unchanged if the modify operation does not affect an attribute of interest to an application which uses the backing sample. Otherwise, a value field in a sample of the tuple in the backing sample is updated. When a tuple is deleted from the relation, any sample of that tuple in the backing sample is removed. A new backing sample may be computed if this removal causes the size of the backing sample to fall below a prespecified lower bound. The backing sample can be of a size which is negligible in comparison to the relation, and need only be modified very infrequently. As a result, its overhead in terms of computation time and storage space is minimal.

REFERENCES:
patent: 4956774 (1990-09-01), Shibamiya et al.
patent: 5379422 (1995-01-01), Antoshenkov
patent: 5555191 (1996-09-01), Hripcsak
patent: 5630120 (1997-05-01), Vachey
patent: 5675786 (1997-10-01), McKee et al.
patent: 5696964 (1997-12-01), Cox et al.
patent: 5758338 (1998-05-01), Faloutsos et al.
patent: 5794229 (1998-08-01), French et al.
patent: 5893090 (1999-04-01), Friedman et al.
Harangsri, B. et al., selectiveity estimation for joins using systematic sampling, database and expert system applications, 1997, proceedings, eight international workshop, and 384-389, Sep. 1997.
Zaki, M.J et al., evaluation of sampling for data mining of asociation rules, research issues in data engineering, 1997, proceedings, seventh international workshop, and 42-50, Apr. 1997.
Duncan, GT, et al., microdata disclosure limitation in statistical databases: query size and random sample query control, research in security and privacy, 1991, proceedings;l 1991 IEEE computer society symposium, and 278-287, May 1991.
Olken, F, et al., maintenance of materialized views of sampling queries, data engineering, 1992, proceedings, eighth international conference and 632-641, Feb. 1992.
Frank Olken et al., "Random Sampling from Databases--A Survey," Information and Computing Sciences Div., CA and Management Information Systems, CA, pp. 1-55, Mar. 1994; Draft of Mar. 22, 1994.
Frank Olken, "Random Sampling from Databases," PhD thesis, University of California, Berkeley, pp. 1-158, Jun. 1993.
J.S. Vitter, "Random Sampling with a Reservoir," ACM Transactions on Mathematical Software, vol. 11, No. 1, pp. 37-57, Mar. 1985.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Maintaining a random sample of a relation in a database in the p does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Maintaining a random sample of a relation in a database in the p, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Maintaining a random sample of a relation in a database in the p will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-1080509

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.