Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-07-30
2002-10-15
Corrielus, Jean M. (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06466931
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to databases, and more particularly to transparently caching and reusing database query execution plans.
BACKGROUND OF THE INVENTION
Today's object-oriented database environments are typically used as front-ends to more simplistic but efficient data models, such as flat files and relational databases. In a relational database, data is perceived by its users as a collection of tables. The tables in a relational database include a row of column names specifying one or more attribute fields, and zero or more data rows containing one scalar value for each of the column fields.
In operation, a client submits a semantic-rich query at the object-level to the database system. Because the object-oriented model is much more sophisticated and complicated than a relational model, it is usually more expensive to process a query in the object space. Therefore, the database system converts the object-oriented query into a relational query that queries the relational database. This is accomplished by generating an execution plan specifying what queries to run against the relational databases, and how to combine those results to compute the final result.
As an example, assume that a user has the following query: “Find all employees having a salary greater than $100,000.” This query, which is expressed in English, could be represented as the following object space query (depending on the computer language): “Select e from EmpClassMOHome e where e.salary>100000”. In the example, “EmpClassMOHome,” identifies a class of objects similar to a table in a relational space, and “e” represents a specific instance in that class.
This query would then be converted into a corresponding relational query in the execution plan. Relational queries to a relational database are carried out through high-level commands or statements, such as SELECT, INSERT, UPDATE and DELETE, which are examples of statements from standard SQL (Structured Query Language). The object space query from the example above would be translated into a query execution plan containing the following query:
SELECT *
FROM Employee
WHERE salary>100,000
The SELECT command specifies the desired attributes to be returned FROM a specified table WHERE some specified condition is true. In this example, a “*” in the SELECT clause means that all attributes from the “Employee” table should be returned for all records meeting the condition in the WHERE clause. The WHERE clause includes one or more literals, each of which includes at least one attribute (e.g., salary), an operator (e.g., >), and either a constant value (e.g., 100,000) or another attribute.
Unfortunately, generating an execution plan from an object query is CPU intensive and may take significantly longer time than even executing the plan against the relational database. Most time in generating the execution plan is spent parsing and analyzing the complex metadata that maps the object attributes to relations.
To minimize this overhead, some systems cache the queries and their corresponding plans. If a new user query matches a cached query, the cached execution plan is reused for the new query. There are generally two methods for caching queries and plans.
One method for caching queries is manual in that it requires the database user to explicitly request the database system to prepare a query (also generate and save the execution plan internally), and then return a handle to it. The user may then reuse that query several times by submitting the same handle to the database system each time the user desires to issue that same query.
Another method replaces all constant values in the query with host variables. An example is:
SELECT id
FROM Employee
WHERE name=:hostv1
The user then specifies a constant value for the host variable (e.g., :hostv1=“Vader”) and submits the query. The host-variable query can then be reused several times, each time specifying different constant values for host variables.
Although the manual and host-variable techniques cache and reuse certain queries, the caching and reuse of the execution plans is non-transparent to the user because the user is aware of, and is required to explicitly facilitate reuse of the execution plans. In addition, some systems that do not support host-variables are limited because they only reuse query execution plans when a new query string exactly matches a cached query string, including constant values. Thus, given the following to queries:
(1) SELECT id FROM Employee WHERE name=“Luke” and
(2) SELECT id FROM Employee WHERE name=“Skywalker”,
conventional methods would consider the two queries different and would spend the time to generate an entirely new execution plan for the second query.
Accordingly, what is needed is a system and method for transparently and efficiently caching and reusing database query execution plans. The present invention addresses such a need.
SUMMARY OF THE INVENTION
The present invention is a method and system for transparently caching and reusing database query execution plans. The method and system include caching a first query that contains a first constant. A first query execution plan is also generated for the query that represents the first constant with a parameter name. The method and system further include receiving a new query that contains second constant. The new query is compared with the first query and a match is determined to exist even when the second constant in the new query fails to match the first constant from the first query. Upon a match, the first query execution plan is reused by substituting the parameter name in the query execution plan with the second constant from the new query without user intervention, thereby avoiding generating a new query execution plan for the new query.
According to the present invention, a more flexible definition of a query match is provided, leading to more frequent reuse of execution plans and increase in system speed. Moreover, the reusable execution plan is obtained for a matching new query at a fraction of the cost of generating a new execution plan.
REFERENCES:
patent: 5161225 (1992-11-01), Abraham et al.
patent: 5367675 (1994-11-01), Cheng et al.
patent: 5481703 (1996-01-01), Kato
patent: 5608904 (1997-03-01), Chaudhuri et al.
patent: 5668987 (1997-09-01), Schneider
patent: 5694491 (1997-12-01), Du et al.
patent: 5706506 (1998-01-01), Jensen et al.
patent: 5758149 (1998-05-01), Bierma et al.
patent: 5809566 (1998-09-01), Charney et al.
patent: 5812996 (1998-09-01), Rubin et al.
patent: 5822749 (1998-10-01), Agarwal
patent: 5842203 (1998-11-01), D'Elena et al.
patent: 5937401 (1999-08-01), Hillegas
patent: 6012054 (2000-01-01), Seputis
patent: 6012085 (2000-01-01), Yohe et al.
patent: 6026391 (2000-02-01), Osborn et al.
patent: 6073129 (2000-06-01), Levine et al.
Attaluri Gopi Krishna
Wisneski David Joseph
Corrielus Jean M.
International Business Machines - Corporation
Nguyen Tam
Sawyer Law Group LLP
LandOfFree
Method and system for transparently caching and reusing... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and system for transparently caching and reusing..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and system for transparently caching and reusing... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2993867