Method for partitioning multi-dimensional data sets into...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C704S240000, C705S035000, C707S793000, C707S793000, C707S793000, C707S793000, C707S793000, C707S793000, C707S793000, C709S201000

Reexamination Certificate

active

06272498

ABSTRACT:

FIELD OF THE INVENTION
The present invention pertains to methods for processing data, and more particularly to a method for partitioning a multi-dimensional data set into a plurality of rectangular-shaped partitions which can be processed and stored more efficiently than the non-partitioned data set
BACKGROUND OF THE INVENTION
The speed and efficiency at which a computer having a fixed processing capability accomplishes a delineated task is directly proportional to the quantity of data of being processed. To accomplish tasks more quickly, some conventional processing methods partition a data set in a database into a plurality of smaller data sets which can be processed together more quickly than the non-partitioned data set from which they are derived, thereby increasing the speed at which the data is processed. One widely used method for processing data in this manner is to construct a histogram approximation of a data set comprised of a plurality of numbers by partitioning the data set into a plurality of subsets, i.e., tiles, and then calculating the average value of the numbers in each tile, which average values are used for processing purposes.
The three most widely used types of partitions constructed for two-dimensional data sets are: an arbitrary partition, shown in
FIG. 1A
, which has no restrictions on the arrangement of tiles; an hierarchical partition, shown in
FIG. 1B
, in which an array is vertically or horizontally separated into two disjointed tiles which are each further hierarchically partitioned; and a p×p partition, shown in
FIG. 1C
, in which the rows and columns are each partitioned into disjointed tiles.
A partitioning metric for evaluating the partition is used to construct the histogram, wherein the metric is selected to be less than a fixed performance constraint value &dgr; which varies depending upon the particular task to be performed. An algorithm is then used to determine the partition to be used. Optimal partitioning is obtained by constructing tiles having a minimum variation between the average value of the numbers in the tiles and each of the numbers themselves.
Partitioning of data sets is used for various purposes including, but not limited to, scheduling when a computer will perform various tasks, as well as for selectivity estimation purposes such as determining how many people in a given population data set fall within each one of a plurality of age distributions. For scheduling tasks, &dgr; represents the maximum time within which a task is to be performed, while for selectivity estimation purposes, &dgr; represents the upper limit on the acceptable error of the result The memory available for processing a data set determines the number of partitions into which the data set can be divided, and thus also serves as a limitation on the selected metric.
Conventional methods for partitioning data sets suffer from significant drawbacks. Specifically, such methods partition the data at very slow speeds, even for small data sets, and require large amounts of computer memory to process the data.
SUMMARY OF INVENTION
A method for partitioning a multi-dimensional data set into a minimum number of rectangular-shaped partitions that enable the data to be processed and stored more efficiently than the non-partitioned data set while satisfying certain specified performance constraints, wherein the data set is partitioned using an algorithm.


REFERENCES:
patent: 5226109 (1993-07-01), Dawson et al.
patent: 5495539 (1996-02-01), Sieverding
patent: 5497486 (1996-03-01), Stolfo et al.
patent: 5546499 (1996-08-01), Lynch et al.
patent: 5551027 (1996-08-01), Choy
patent: 5701467 (1997-12-01), Freeston
patent: 5710915 (1998-01-01), McElhiney
patent: 5781906 (1998-07-01), Aggrawal et al.
patent: 5864857 (1999-01-01), Ohata et al.
patent: 5878409 (1999-03-01), Baru et al.
patent: 6003036 (1999-12-01), Martin
patent: 6014656 (2000-01-01), Hallmark et al.
patent: 6122628 (2000-09-01), Castelli et al.
Polo, A. et al., “Multi-dimensional Partitioning for Massively Parallel Database Machines”, Proceedings of the 1995 Euromicro Workshop on Parallel and Distributed Processing, Jan. 25-27, 1995, pp. 244-241.*
Wang, Jenny et al., Minimization of Memory Access Overhead for Multidimensional DSP Applications via Multilevel Partitioning, IEEE Transactions On Circuits And Systems-II: Analog and Digital Signal Processing, vol. 44, No. 9, Sep. 1997, pp. 741-753.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method for partitioning multi-dimensional data sets into... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method for partitioning multi-dimensional data sets into..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for partitioning multi-dimensional data sets into... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2501733

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.