Method and system for look ahead query evaluation planning...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

C707S793000

active

06345267

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to database management systems and, more particularly, to efficient evaluation of SQL statements processed in relational database management systems.
2. Description of the Related Art
Information is frequently stored in computer processing systems in the form of a relational database. A relational database stores information as a collection of tables having interrelated columns and rows. A relational database management system (RDBMS) provides a user interface to store and retrieve the information and provides a query methodology that permits table operations to be performed on the data. One such RDBMS interface is the Structured Query Language (SQL), which is specified by standards adopted by the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO) following original development work by the International Business Machines (IBM) Corporation. The SQL interface permits users to formulate operations on the data tables interactively, through batch file processing, or embedded in host languages such as C, COBOL, or the like.
In particular, SQL provides table operations with which users can request database information and form one or more new tables out of the operation results. Data from multiple tables, or views, can be linked to perform complex sets of table operations with a single statement. The table operations are specified in SQL statements called queries. One typical SQL operation in a query is the “SELECT” operation, which retrieves table rows and columns that meet a specified selection parameter. Another operation permitted by SQL is the “JOIN” operation, which concatenates all or part of two or more tables to create a new resulting table. For example, a query might produce a table that contains the names of all supervisory employees who live in a given city, and might do so by specifying a SELECT operation to retrieve employee names and resident cities from one table, and then performing a JOIN of that data after a SELECT operation to retrieve employee names and job titles from another table.
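The supervisory-employee example above can be sketched with Python's standard sqlite3 module. This is only an illustration: the table names `residence` and `jobs`, their columns, and the sample rows are assumptions made here and do not come from the patent.

```python
import sqlite3


def supervisors_in_city(city):
    """Return names of supervisors living in `city` (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # Two illustrative tables: one holds resident cities, the other job titles.
    conn.executescript("""
        CREATE TABLE residence (name TEXT, city TEXT);
        CREATE TABLE jobs (name TEXT, title TEXT);
        INSERT INTO residence VALUES
            ('Ann', 'Austin'), ('Bob', 'Boston'), ('Cy', 'Austin');
        INSERT INTO jobs VALUES
            ('Ann', 'Supervisor'), ('Bob', 'Supervisor'), ('Cy', 'Clerk');
    """)
    # Select from each table, then join the two selections on employee name.
    rows = conn.execute(
        "SELECT r.name FROM residence AS r JOIN jobs AS j ON r.name = j.name "
        "WHERE r.city = ? AND j.title = 'Supervisor' ORDER BY r.name",
        (city,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]
```

With the sample rows assumed here, only Ann is both an Austin resident and a supervisor.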
Evaluation of SQL Statements
An SQL query generally includes at least one predicate, which is an SQL expression that can assume a logical value of TRUE, FALSE, or UNKNOWN. A predicate typically either specifies a data range, tests for an existence condition, tests for equivalence, or performs a similar table comparison operation. SQL queries and their resulting table operations can be nested through several levels of predicates such that a higher nested predicate, or level of operation, cannot be evaluated until after a lower level predicate, or operation, has been evaluated. A lower level of SQL operation in an SQL statement is generally referred to as a subquery. An example of an SQL query is provided below in Table 1:
Table 1
SELECT A.y, sum (B.y)
FROM A, B
WHERE A.x=B.x
GROUP BY A.y;
Those skilled in the art will understand that the notation “A.y” indicates the y column for all rows of Table A.
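To make the Table 1 query concrete, the following sketch runs it against small in-memory tables using Python's standard sqlite3 module. The column types and the sample rows are assumptions chosen for illustration; only the query text itself comes from Table 1.

```python
import sqlite3


def run_table1_query():
    """Run the Table 1 query on illustrative data and return sorted rows."""
    conn = sqlite3.connect(":memory:")
    # Sample data is hypothetical; B's row with x = 4 has no match in A.
    conn.executescript("""
        CREATE TABLE A (x INTEGER, y TEXT);
        CREATE TABLE B (x INTEGER, y INTEGER);
        INSERT INTO A VALUES (1, 'p'), (2, 'p'), (3, 'q');
        INSERT INTO B VALUES (1, 10), (1, 5), (2, 7), (3, 4), (4, 100);
    """)
    rows = conn.execute(
        "SELECT A.y, sum(B.y) FROM A, B WHERE A.x = B.x GROUP BY A.y"
    ).fetchall()
    conn.close()
    # The query has no ORDER BY, so sort here to get a stable result.
    return sorted(rows)
```

Rows of A and B are paired where their x values match, and the B.y values are then summed within each distinct A.y value.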
The Query Graph Model
When a query is input to an RDBMS, it is received by a query processor that puts the query through a query optimization process. The optimization is generally performed by a query optimizer of the RDBMS. During query optimization, the SQL query is parsed, or rewritten, into an RDBMS internal representation referred to as the query graph model (QGM). The QGM is a high-level, graphical representation of the input query in which boxes represent relational operations and arcs connecting the boxes represent quantifiers that reference tables. A QGM representation of the Table 1 query is shown in FIG. 1. The three boxes represent the three subqueries that make up the query of Table 1 and indicate operations that are executed on incoming data, which flows from bottom to top. The arcs represent quantifiers that in some way restrict the information flowing into the respective boxes. The QGM representation of FIG. 1 can best be understood by matching each QGM box from the bottom up. Thus, the “select . . . from A, B” subquery corresponds to the two arcs A and B, respectively, and the “A.y, sum(B.y)” is indicated by the quantifier arc. The subquery “where A.x=B.x” corresponds to the bottom QGM box and the “group by A.y” is represented by the middle box. The final SELECT box in FIG. 1 represents the retrieval of data satisfying the query.
In general, each box of the QGM includes the predicates applied by the box relational operation and is associated with characteristics such as an input or output order specification where appropriate, a “distinct” flag, and the like. A basic set of QGM boxes from which queries can be represented includes SELECT, GROUP BY, and UNION. Join operations are represented by a SELECT box with two or more input quantifiers, whereas the ORDER BY operation is represented by a SELECT box with an output order specification. As part of the query optimization processing, the original QGM can be transformed into a more efficient QGM using techniques such as conversion heuristics, predicate push-down, view merging, and subquery-to-join transformation. The transformed QGM is semantically equivalent to the original QGM, but can be more efficiently evaluated by the query processor.
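One way to picture the QGM boxes described above is as a small tree of records. The sketch below is a hypothetical Python data structure, not the representation used by any particular RDBMS; the class name `Box`, its field names, and the helper functions are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Box:
    """Hypothetical QGM box: a relational operation plus its characteristics."""
    kind: str                                   # "SELECT", "GROUP BY", or "UNION"
    inputs: List[Union["Box", str]] = field(default_factory=list)  # quantifiers
    predicates: List[str] = field(default_factory=list)
    columns: List[str] = field(default_factory=list)
    output_order: List[str] = field(default_factory=list)
    distinct: bool = False


def is_join(box):
    # A join is a SELECT box with two or more input quantifiers.
    return box.kind == "SELECT" and len(box.inputs) >= 2


def is_order_by(box):
    # ORDER BY is a SELECT box carrying an output order specification.
    return box.kind == "SELECT" and bool(box.output_order)


def table1_qgm():
    """Build the three boxes of the Table 1 query, bottom box first."""
    bottom = Box("SELECT", inputs=["A", "B"], predicates=["A.x = B.x"])
    middle = Box("GROUP BY", inputs=[bottom], columns=["A.y", "sum(B.y)"])
    return Box("SELECT", inputs=[middle], columns=["A.y", "sum(B.y)"])


def kinds_bottom_up(box):
    """List box kinds along the first input chain, lowest box first."""
    chain = []
    while isinstance(box, Box):
        chain.append(box.kind)
        box = box.inputs[0] if box.inputs else None
    return list(reversed(chain))
```

Keeping joins and ORDER BY as variations of the SELECT box, rather than as separate box types, mirrors the small basic set of boxes the text describes.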
After the QGM is generated and optionally transformed, cost-based optimization is performed in which the QGM is graphically traversed and a query execution plan (QEP) is generated for evaluation. The RDBMS query processor interprets the QEP and thereby executes it, retrieving the requested data. The QEP can be viewed as a dataflow graph of operators, where each node of the graph corresponds to a relational operation such as a JOIN or a relatively low level operation such as a SORT. Each operator consumes one or more input records (that is, tables) and produces an output set of records (more tables). These tables will be referred to generally as output streams. A QEP representation of the Table 1 query is shown in FIG. 2.
FIG. 2 shows that a QEP graphical representation includes circular or oval objects that represent operators, connected by arcs that represent information streams. Thus, in FIG. 2 a table scan operator that acts on Table A produces information that is fed to a “join” oval, along with information from a table scan operator that acts on Table B. After the JOIN operator, which generates the output corresponding to the “where A.x=B.x” clause of the SQL query, the information stream flows into the “sort” oval, and then the information stream flows into the “group by” oval.
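The dataflow described for FIG. 2 can be sketched as a chain of Python generators, one per operator. This is a simplified illustration under stated assumptions: rows are dictionaries keyed by qualified column name, the join is a plain nested-loop join, and the function names and sample data are hypothetical rather than taken from the patent.

```python
from itertools import groupby

# Illustrative sample tables; the values are assumptions.
TABLE_A = [{"A.x": 1, "A.y": "p"}, {"A.x": 2, "A.y": "p"}, {"A.x": 3, "A.y": "q"}]
TABLE_B = [{"B.x": 1, "B.y": 10}, {"B.x": 1, "B.y": 5},
           {"B.x": 2, "B.y": 7}, {"B.x": 3, "B.y": 4}]


def scan(table):
    """Table scan operator: emit each row of a stored table."""
    for row in table:
        yield row


def join(left, right, left_key, right_key):
    """Nested-loop join operator: pair rows whose key values are equal."""
    right_rows = list(right)  # materialize so the inner input can be rescanned
    for l in left:
        for r in right_rows:
            if l[left_key] == r[right_key]:
                yield {**l, **r}


def sort(stream, key):
    """Sort operator: emit the input stream ordered on one column."""
    return iter(sorted(stream, key=lambda row: row[key]))


def group_by_sum(stream, key, value):
    """Group-by operator: relies on the input already being ordered on `key`."""
    for k, rows in groupby(stream, key=lambda row: row[key]):
        yield (k, sum(row[value] for row in rows))


def run_plan(table_a, table_b):
    """Wire the operators in the scan, join, sort, group-by order of FIG. 2."""
    joined = join(scan(table_a), scan(table_b), "A.x", "B.x")
    ordered = sort(joined, "A.y")
    return list(group_by_sum(ordered, "A.y", "B.y"))
```

In this sketch the sort precedes the group-by because `itertools.groupby` only merges adjacent rows, so the grouping operator needs its input ordered on the grouping column.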
Each stream in a QEP has an associated set of properties. Properties characterize the information that is being moved between operations. Examples of properties include the columns that make up each record in the input stream, the set of predicates that have been applied to the stream, and the order of the stream. Each operator in a QEP determines the properties of its output stream. The properties of an operator's output stream are a function of its input stream and its applied operation. For example, a sort operator passes on all the properties of its input stream unchanged except for the order property and cost.
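The sort example can be sketched as a function from input properties to output properties. The record type `StreamProperties`, its fields, and the additive cost are assumptions made for illustration; an actual optimizer tracks more properties and uses its own cost formulas.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class StreamProperties:
    """Hypothetical property set carried by a QEP stream."""
    columns: Tuple[str, ...]      # columns that make up each record
    predicates: Tuple[str, ...]   # predicates already applied to the stream
    order: Tuple[str, ...]        # columns the stream is ordered on
    cost: float                   # estimated cost so far (arbitrary units)


def apply_sort(props, sort_columns, sort_cost):
    """Output properties of a sort: only order and cost differ from the input."""
    return replace(props, order=tuple(sort_columns), cost=props.cost + sort_cost)
```

For example, sorting a joined stream on A.y leaves its columns and applied predicates untouched while setting the order to A.y and raising the cost by the assumed sort cost.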
A QEP is generated, or built, from the bottom up by the query optimizer, operator by operator, using the QGM as a guide. A set of properties is computed for each operator, and as the QEP is built, the optimizer matches the QEP's properties against requirements on those properties. The requirements can arise from the SQL query or from the characteristics of operators, such as JOIN. For example, a query with an ORDER BY clause results in an “order” requirement. If a QEP did not already satisfy the order dictated by the ORDER BY clause, then a sort operation would be added to the QEP to satisfy the ordering requirement of the ORDER BY operator.
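The order requirement check described above can be sketched as follows. The `Plan` record, the operator names, and the prefix test used to decide whether an existing order satisfies a requirement are simplifying assumptions for illustration, not the patent's method.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical partial QEP: operators applied so far and the stream order."""
    operators: Tuple[str, ...]
    order: Tuple[str, ...]


def order_satisfied(plan, required):
    # Assumption: a stream ordered on (a, b) also satisfies a requirement on (a).
    return plan.order[:len(required)] == tuple(required)


def enforce_order(plan, required):
    """Return the plan unchanged if it meets the order, else add a SORT."""
    if order_satisfied(plan, required):
        return plan
    return Plan(plan.operators + ("SORT",), tuple(required))
```

A plan whose stream is already ordered on the required columns is returned as-is, so the extra sort is only paid for when the requirement is not yet met.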
As a QEP is built, different plan alternatives for operators are generated and compared. An analytical cost model is typically used to estimate the execution time of
