System and method for processing queries having an inner...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000

Reexamination Certificate

active

06370524

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to computer database systems and more particularly to processing embedded queries containing a grouping operator.
BACKGROUND OF THE INVENTION
Relational databases store information in collections of tables, in which each table is organized into rows and columns. FIG.
6
(
a
) illustrates an exemplary database containing two tables, customers table
600
and orders table
620
, that is useful in recording information about a sales operation. The columns of customers table
600
hold attributes for customers, for example, a customer identifier CID
602
, an a zip code ZIP
604
, and each row
610
-
618
is an entry for a respective customer. For example, row
610
is an entry for customer with a CID of 2 and a ZIP of 19555. The orders table
620
holds information in columns for the orders that the customers place. Such information may include, for example, an order identifier OID
622
, the customer identifier CID
624
of the customer placing the order, and a dollar amount PRICE
626
.
A user retrieves information from the tables of a relational database by entering input that is converted to queries by a database application. The database application submits the queries to a database server. In response to receiving a query, the database server accesses the tables specified in the query to determine which information within the tables satisfies the query. The information that satisfies the queries is then retrieved by the database server and transmitted to the database application and ultimately to the user. Database servers are also capable of combining or “aggregating” information contained in the tables in response to a query. For example, a common query in the exemplary database is to total all the order amounts to compute a sales figure for the customers in a given ZIP code.
For any given database application, the queries must conform to the rules of a particular query language. Most query languages provide users with a variety of ways to specify information to be retrieved. For example, in the Structured Query Language (SQL), the following query requests the retrieval of sales information for customers in zip code 19555 by adding up the order amounts:
select v.CID, v.SALES
from CUSTOMERS,
(select CID, sum(PRICE) as SALES
from ORDERS
group by CID) v
where CUSTOMERS.CID=V.CID and ZIP=19555;
This query actually contains two queries, one of which is embedded in the other. The inner query is in the form of a view definition, and the view thus defined is referenced by the name “v”. In this example, the view v requests the database server to group all the rows in orders table
620
by the customer identifier CID
622
and sum all the values of the PRICE column
626
as a SALES figure for each group of rows. Referring to FIG.
6
(
b
), view v
640
therefore contains two columns, CID
642
for the customer identifier and SALES
644
for the aggregate sales figure as a sum of the PRICE column
626
values for each of the groups of rows.
The outer query requests the database server to select those sales figures
644
of view v
640
in which the customer identifier CID
602
matches the view CID
642
and the zip code ZIP
604
of the customer is 19555. In the example, the outer query selects a customer identifier CID of 2 and a sales figure of
45
from row
650
of view v
640
, because row
610
, with a CID
602
of 2 and a ZIP
604
of 19555, satisfies the conditions of the WHERE clause.
This kind of query, which embeds an inner query with a grouping function, is fairly common in database applications because it is straightforward for users to write and understand. Executing this query in a straightforward manner, however, by first evaluating the view defined by the inner query and then evaluating the outer query, may not be scalable for tables containing a very large number of rows. Evaluating the view first results in performing a fill table scan of the table referenced in the FROM list of the inner query (the “inner” table). In some environments, there can be millions of rows in the inner table, even though the outer query only selects a few rows of the inner table. Consequently, in such conventional approaches, the database server performs much unnecessary work in aggregating columns for the rows of the inner table that will never be selected by the outer query. As the number of rows in the inner table greatly increase, the performance of the processing this kind of query is substantially degraded. Therefore, it is desirable to process this type of query more efficiently.
One approach for optimizing this kind of query involves the use of a “magic set transformation.” In a magic set transformation, a new, innermost query block is introduced into the inner query block to first select an appropriate subset of the data. In the example, a magic set transformation results in the following query, with the magic set query block labeled “magic” and a predicate referencing the magic set marked in bold:
select v.CID, V.SALES
from CUSTOMERS,
(select CID, sum(PRICE) as SALES
from ORDERS,
(select CID from CUSTOMERS
where ZIP=19555) magic
where ORDERS.CID=magic.CID
group by CID) v
where CUSTOMERS.CID=v.CID and ZIP=19555;
Accordingly, when this transformed query is processed, the magic set query is evaluated first, selecting only one CID from customers table
600
. In turn, this single CID selects only two rows
630
and
636
from the order tables
620
for view v, whose sales column is a summation of the price column
626
for aggregated rows. Consequently, a magic set transformation, which introduces an artificial entity that duplicates the outer table customers
600
, is capable of reducing the amount of processing involved in grouping over the orders table
620
.
Although the magic set transformation may be acceptable when the queries are very simple, e.g., involving only two tables, a disadvantage with the magic set transformation is particularly apparent when there is a much larger number of tables referenced in the outer query block and the inner query block. It is known that the performance of a join operation of multiple tables is highly dependent on the particular order in which the tables are joined. As a result, techniques have been developed in choosing an efficient join order, and generally these techniques are most effective when there are more options to consider, i.e., more tables to choose from.
The magic set transformation, however, reduces the number of options that are available for such techniques for choosing an efficient join order. For example, the magic set transformation maintains the separation of the inner tables of the inner query block from the outer tables of the outer query block, even though there may be a more efficient join order involving a combination of the inner tables and the outer tables.
SUMMARY OF THE INVENTION
Therefore, there is a need for a method of transforming a query containing an embedded query with a grouping operator that reduces the amount of processing to execute the entire query. There is also a need for a method of improving the processing time of such embedded queries that does not foreclose possible join orderings from consideration and, desirably, increases the number of possible join orderings. These and other needs are addressed by a query transformation that merges the embedded inner query block into the outer query block. By merging the embedded query, no particular join sub-ordering is dictated and inner tables are capable of being combined with the outer tables to form a efficient join ordering.
One aspect of the invention is a computer-implemented method and a computer-readable medium bearing instructions for processing a query in a database system by receiving an original query that contains an inner query block with a grouping operator such as GROUP BY or DISTINCT in SQL. The inner query block is embedded in an outer query block, for example, as a reference to a view or as a subquery. The original query is transformed into

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

System and method for processing queries having an inner... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with System and method for processing queries having an inner..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and method for processing queries having an inner... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2923118

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.