Parallel partition-wise joins

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000

Reexamination Certificate

active

06609131

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to computer systems and, more particularly, to techniques for performing joins between objects within computer systems.
BACKGROUND OF THE INVENTION
In conventional relational database tables, rows are inserted into the table without regard to any type of ordering. Consequently, when a user submits a query that selects data from the table based on a particular value or range of values, the entire table has to be scanned to ensure that all rows that satisfy the criteria are identified. Partitioning is a technique that, in certain situations, avoids the need to search an entire table (or other database object).
With partitioning, an object, such as a database table, is divided up into sub-tables, referred to as “partitions”. The most common form of partitioning is referred to range partitioning. With range partitioning, each individual partition corresponds to a particular range of values for one or more columns of the table. For example, one column of a table may store date values that fall within a particular year, and the table may be divided into twelve partitions, each of which corresponds to a month of that year. All rows that have a particular month in the date column would then be inserted into the partition that corresponds to that month. In this example, partitioning the table will increase the efficiency of processing queries that select rows based on the month contained in the date column. For example, if a particular query selected all rows where months equals January, then only the partition associated with the month of January would have to be scanned.
Typically, the criteria used to partition a database object is specified in the statement that creates the database object. For example, the following Structured Query Language SQL) statement creates a table “sales” that is range partitioned based on date values contained in a column named “saledate”:
create table sales
(saledate DATE,
productid NUMBER, . . . )
partition by range (saledate)
partition sal
94
Q
1
values less than to_ate (yy-mm-dd, ‘94-04-01’)
partition sal
94
Q
2
values less than to_ate (yy-mm-dd, ‘94-07-01’)
partition sal
94
Q
3
values less than to_ate (yy-mm-dd, ‘94-10-01’)
partition sal
94
Q
4
values less than to_ate (yy-mm-dd, ‘95-01-01’)
Execution of this statement creates a table named “sales” that includes four partitions: sal
94
Q
1
, sal
94
Q
2
, sal
94
Q
3
, and sal
94
Q
4
. The partition named sal
94
Q
1
includes all rows that have a date less than 94-04-01 in their saledate column. The partition named sal
94
Q
2
includes all rows that have a date greater than or equal to 94-04-01 but less than 94-07-01 in their saledate column. The partition named sal
94
Q
3
includes all rows that have a date greater than or equal to 94-07-01 but less than 94-10-01 in their saledate column. The partition named sal
94
Q
4
includes all rows that have a date greater than or equal to 94-10-01 but less than 95-01-01 in their saledate column.
When a database server receives a request to perform an operation, the database server makes a plan of how to execute the query. If the operation involves accessing a partitioned object, part of making the plan involves determining which partitions of the partitioned object, if any, can be excluded from the plan (i.e. which partitions need not be accessed to execute the query). The process of excluding partitions from the execution plan of a query that accesses a partitioned object is referred to as “partition pruning”.
Unfortunately, conventional pruning techniques can only be applied to a limited set of statements. For example, the database server can perform partition pruning when the statement received by the database server explicitly limits itself to a partition or set of partitions. Thus, the database server can exclude from the execution plan of the statement “select * from sales PARTITION(sal
94
Q
1
)” all partitions of the sales table other than the sal
94
Q
1
partition.
The database server can also perform partition pruning on statements that do not explicitly limit themselves to particular partitions, but which select data based on the same criteria that was used to partition the partitioned object. For example, the statement:
select * from sales where saledate between (94-04-01) and (94-07-01) does not explicitly limit itself to particular partitions. However, because the statement limits itself based on the same criteria (saledate values) as was used to partition the sales table, the database server is able to determine, based on the selection criteria of the statement and the partition definitions of the table, which partitions need not be accessed during execution of the statement. In the present example, the database server would be able to perform partition pruning that limits the execution plan of the statement to sal
94
Q
2
.
Similarly, database servers can perform partition pruning for queries with WHERE clauses that (1) specify equalities that involve the partition key (e.g. where saledate=94-02-05), (2) include IN lists that specify partition key values (e.g. where saledate IN (94-02-05, 94-03-06)), and (3) include IN subqueries that involve the partition key (e.g. where salesdate in (select datevalue from T)).
Another form of partitioning is referred to as hash partitioning. According to hash partitioning, one or more values from each record are applied to a hash function to produce a hash value. A separate partition is established for each possible hash value produced by the hash function, and rows that hash to a particular value are stored within the partition that is associated with that hash value. Similar to range based partitioning, hash partitioning increases the efficiency of processing certain types of queries. For example, when a query selects all rows that contain a particular value in the column that is used to perform the hash partitioning, the database server can apply the value in the query to the hash function to produce a hash value, and then limit the scan of the table to the partition that corresponds to the hash value thus produced.
A table that is hash partitioned into four partitions may be created by the following statement:
create table sales
(saledate DATE,
productid NUMBER, . . . )
partition by hash (saledate)
partitions 4;
Similar to range partitions, hash partitions may be used for queries with WHERE clauses that (1) specify equalities that involve the partition key, (2) include IN lists that specify partition key values, and (3) include IN subqueries that involve the partition key. However, unlike range-based partitioning, partition pruning cannot be performed for statements with predicates that specify ranges of partition key values. Consequently, hash-based partitioning is often used when the nature of the partition key is such that range-based queries are unlikely, such as when the partition key is “social security number”, “area code” or “zip code”.
Due to the benefits that result from partition pruning, it is clearly desirable to provide techniques for performing partition pruning for a wider variety of statements.
SUMMARY OF THE INVENTION
Techniques are provided to expand the concept of partitioning in variety of ways. For example, both hash partitioning and range partitioning can be characterized as single-dimension partitioning because they use a single criteria to divide up the partitioned objects. One aspect of the invention is to perform multiple-dimension partitioning. In multiple-dimension partitioning, a database object is divided into partitions based on one criteria, and each of those resulting partitions is divided into sub-partitions based on a second criteria. The process of partitioning partitions based on different criteria may be repeated across any number of dimensions. In addition, entirely different partitioning techniques may be used for each level of partitioning. For example, database objects may be partitioned across one dimension using range-based partitioning, and each of those range-based partitions may be partitioned acro

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Parallel partition-wise joins does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Parallel partition-wise joins, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Parallel partition-wise joins will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3100000

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.