Outerjoin and antijoin reordering using extended eligibility...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06665663

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of queries that reference joins other than the standard “inner” join, specifically “outerjoins” and “antijoins.”
2. Description of Related Art
(Note: This application references a number of different publications as indicated throughout the specification by mnemonics enclosed in brackets, e.g., [Authorxx], wherein Author is the author's name (or abbreviation thereof) and xx is the year of publication. A list of these different publications with their associated mnemonics can be found in Section 7 entitled “References” in the “Detailed Description of the Preferred Embodiment.” Each of these publications is incorporated by reference herein.)
Computer systems incorporating Relational DataBase Management System (RDBMS) software using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
Relational join combines information from two base tables by creating pairs of matching rows. Rows without any matches are simply discarded. These kinds of joins are referred to as inner joins. In addition to inner joins, there are two other types of joins commonly seen in relational database systems, namely outerjoins and antijoins.
Outerjoin [Codd79] is a modification of inner join that preserves all information from one or both of its arguments. Outerjoins can be further categorized into left, right (single-sided), or fall (two-sided) outerjoin, depending on which side needs to be preserved. For example, the following SQL query will return all the department names and employees within each department. For those departments without employees, the department names are listed with the employee name set to NULL.
Example:
An outerjoin query:
SELECT
department.dname, employee.ename
FROM
department LEFT JOIN employee
ON department.no = employee.dno
Outerjoins are important because they are frequently used in the following applications [GR97]: (a) certain OLAP queries where we need to preserve rows from the fact table with unknown (or missing) dimensional values; (b) constructing hierarchical views that preserve objects with no children and (c) queries generated by external tools.
Recently, [STH+99] proposed a way of using relational database to handle XML queries. Outerjoins are needed to express XML paths. Outerjoins are also useful for exporting relational data into XML documents [SSB+00]. For example, to generate an XML document describing a customer from relational tables, a potential implementation will issue the following query:
SELECT
cust.*, acct.*, porder.*, pay.*, item.*
FROM
Customer cust
LEFT JOIN Account acct ON cust.id = acct.custId
LEFT JOIN PurchOrder porder ON cust.id = porder.custId
LEFT JOIN Item item ON porder.id = item.poId
LEFT JOIN Payment pay ON porder.id = pay.poId
Antijoin is useful for handling negated nested queries. Straightforward evaluation of those queries would require using the nested iteration method, which may be very inefficient. [Kim82] proposed to transform negated nested queries into antijoins. Since join methods other than the nested loops join could potentially be used, this transformation gives the optimizer more freedom. The following example shows such a transformation. An antijoin preserves a row from the outer table if there is no match from the inner table. Otherwise, the row is discarded. Antijoin queries occur a lot in commercial systems. For example, negated nested queries are often used to maintain referential integrity.
Example:
Transform a negated nested query into an antijoin query.
Original query: list the name of all departments with no employees.
SELECT
department.dname
FROM
department
WHERE
NOT EXISTS
(SELECT * FROM employee
WHERE department.no = employee.dno)
After transformation:
SELECT
department.dname
FROM
department ANTIJOIN employee
ON department.no = employee.dno
When there are only inner joins in a query, a query optimizer considers all possible join orders and selects the cheapest execution plan. Changing the order of join evaluation is a powerful optimization technique and can improve execution time by orders of magnitude. However, when outerjoins and/or antijoins are present in addition to inner joins, changing the order of evaluation is complicated. This is because these three types of joins are not always associative with each other. Two invalid transformations (verified in [RG90]) are shown below:
R LEFT JOIN (S INNER JOIN T)≠(R LEFT JOIN S) INNER JOIN T
R LEFT JOIN (S ANTIJOIN T)≠(R LEFT JOIN S) ANTIJOIN T
As a result, not all orders will give the same answer as the original query, unless special consideration is taken.
The problem of outerjoin reordering has been studied before in [RG90,GR92,BGI95,GR97], with [GR97] being the most comprehensive.
In [GR97], the authors identify a special class of query called simple join/outerjoin queries. A simple query has the property that its query graph (without specifying the join orders) unambiguously determines the semantics of the query. A conflicting set for each join predicate p is then computed through some graph analysis, which includes all join predicates that conflict with p. The information stored in the conflicting set can be used to form proper join orders in a conventional bottom-up join optimizer. Basically, when a join predicate p is used to combine two subplans, the optimizer checks if p conflicts with any join predicates used in either subplan. If so, a generalized outerjoin will be introduced to guarantee the correctness. This is described in more detail in Section 5 below.
There are two limitations in the approach used in [GR97]. First of all, it provides solutions to simple queries only. While simple queries are an important class of query, there are many real-world queries that are not simple. For example, predicates with more than one conjunct, predicates referencing more than two tables, and Cartesian products are not allowed in simple queries. This limits the application of the technique in commercial systems. As a matter of fact, many commercial database systems (e.g., Sybase IQ [Kirk99], Informix [Brown00]) either evaluate outerjoin queries in the order specified by the user or only allow limited reordering. Second, reordering with the presence of antijoins is not considered in [GR97].
The present invention proposes a new reordering approach that can handle a more general class of queries and more types of joins. This reordering is performed in a conventional bottom-up optimizer using dynamic programming [SAC+791]. Commercial systems such as DB2 [IBM99] associate with each join predicate an eligibility list. Normally, the eligibility list of a join predicate includes only those tables that are referenced in this join predicate. During the bottom-up join enumeration, the optimizer checks if there is a join predicate p whose eligibility list is a subset of all the tables in the two subplans to be merged. If so, the two subplans are combined using p. Otherwise, the two subplans cannot be joined (unless a Cartesian product is introduced).
To incorporate reordering with outerjoins and antijoins, the present invention extends the normal eligibility list. For each join predicate, an extended eligibility list (referred to as EEL) is calculated, which includes additional tables referenced in those conflicting join predicates. Intuitively, an EEL gives all the tables needed by a predicate to preserve the semantics of the original query. EELs are precomputed during one traversal of the operator tree of the original query. Such an extension i

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Outerjoin and antijoin reordering using extended eligibility... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Outerjoin and antijoin reordering using extended eligibility..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Outerjoin and antijoin reordering using extended eligibility... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3137151

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.