Method for determining optimal database materializations...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000, C709S241000, C709S241000

Reexamination Certificate

active

06356889

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to database management systems and more particularly to a method for determining optimal database materializations utilizing a query optimizer.
BACKGROUND OF THE INVENTION
A database management system (DBMS) comprises the combination of an appropriate computer, direct access storage devices (DASD) or disk drives, and database management software. A relational database management system is a DBMS which uses relational techniques for storing and retrieving information. The relational database management system or RDBMS comprises computerized information storage and retrieval systems in which data is stored on disk drives or DASD for semi-permanent storage. The data is stored in the form of tables which comprise rows and columns. Each row or tuple has one or more columns.
The RDBMS is designed to accept commands to store, retrieve, and delete data. One widely used and well known set of commands is based on the Structured Query Language or SQL. The term query refers to a set of commands in SQL for retrieving data from the RDBMS. The definitions of SQL provide that a RDBMS should respond to a particular query with a particular set of data given a specified database content. SQL however does not specify the actual method to find the requested information in the tables on the disk drives. There are many ways in which a query can be processed and each consumes a different amount of processor and input/output access time. The method in which the query is processed, i.e. query execution plan, affects the overall time for retrieving the data. The time taken to retrieve data can be critical to the operation of the database. tt is therefore important to select a method for finding the data requested in a query which minimizes the computer and disk access time, and therefore, optimizing the cost of doing the query.
A database system user retrieves data from the database by entering requests or queries into the database. The RDBMS interprets the user's query and then determines how best to go about retrieving the requested data. In order to achieve this, the RDBMS has a component called the query optimizer. The RDBMS uses the query optimizer to analyze how to best conduct the user's query of the database with optimum speed in accessing the database being the primary factor. The query optimizer takes the query and generates a query execution plan. The query execution plan comprises a translation of the user's SQL commands in terms of the RDBMS operators. There may be several alternative query execution plans generated by the query optimizer, each specifying a set of operations to be executed by the RDBMS. The many query execution plans generated for a single query ultimately differ in their total cost of obtaining the desired data. The query optimizer then evaluates these cost estimates for each query execution plan in order to determine which plan has the lowest execution cost. In order to determine a query execution plan with the lowest execution cost, the query optimizer uses specific combinations of operations to collect and retrieve the desired data. When a query execution plan is finally selected and executed, the data requested by the user is retrieved according to that specific query execution plan however manipulated or rearranged.
In a SQL based RDBMS the query execution plan comprises a set of primitive operations or commands, e.g. JOIN; a sequence in which the retrieve operations will be executed, e.g. JOIN ORDER; a specific method for performing the operation, e.g. SORT-MERGE JOIN; or an access method to obtain records from the base relations, e.g. INDEX SCAN. In most database systems, particularly large institutional systems, a cost-based query optimizer will be utilized. A cost-based query optimizer uses estimates of I/O and CPU resource consumption in determining the most efficient query execution plan because both I/O and CPU resource consumption depend on the number of rows that need to be processed.
The performance of queries against a database may be enhanced significantly by materializing certain data that may be redundant of data already in the database. This materialized data may be organized in ways better suited to certain database operations, such as searching for specific data, for example as with indexes, or may pre-compute information likely to be asked for often, as with materialized views, for example.
Data materialization such as indexes can benefit performance of a query in one or more of the following ways. First, a materialization can be used to rapidly find data which satisfies a user-specified search criterion, for example, predicates specified in the WHERE clause of an SQL query. Second, a materialization can be used to access rows in a particular order, thereby saving sort operations to achieve that ordering for operations such as JOINS or GROUP BY or ORDER BY clauses specified by the user. Thirdly, a materialization can be used to provide a subset of a table's columns, or tables' in the case of join indexes, that are a superset of the columns requested in a user query, thereby saving the access of the data pages of the base table. Because the data pages in the base table are presumably much larger than the index pages in the index table, the cost per row is greater for the base table.
On the other hand, materialized data has additional costs which include the following: (1) update costs to keep the materialization consistent with other data that has been modified, requiring possible data access and computation to determine the new contents of the materialized data based upon what data was modified; (2) storage costs for the materialized data, which is usually redundant of the base data; (3) increased optimization time to consider the use of these materializations, as an alternative to, or in addition to, accessing the base table.
Currently, most database systems leave the determination of the appropriate materializations up to the user. However, this can be very difficult and/or time-consuming for the user. There are typically numerous possible materializations and in many possible combinations. Furthermore, the costs and benefits of each combination will, in general, be very difficult for the user to assess.
Accordingly the present invention provides a method for determining optimal database materializations using a query optimized.
BRIEF SUMMARY OF THE INVENTION
The present invention provides a method for exploiting a database query optimizer to recommend materializations, for example indexes, of a database to enhance performance. The method takes one or more queries as inputs and using the cost-based optimizer in the database management system generates a series of virtual materializations by materializing some subsets of the database. The virtual materializations are used to consider the relative performance benefits, i.e. cost-benefits, for the queries based on the various virtual materializations. If the optimizer decides to use any of the materializations in its plan, then those materializations are recommended to the user, or created automatically for the user.
A feature of the present invention is that the method is incorporated into the query optimizer, rather than as an external tool. This arrangement provides the following advantages: (1) The existing infrastructure for iterating through alternative access paths to a table, and for estimating execution costs of these alternatives, can be exploited, and do not have to be replicated in an external tool. (2) Maintenance costs decrease because the equations used to estimate costs and benefits need be maintained in only one place, i.e. in the query optimizer. (3) There is greater accuracy in cost estimation, since there is no possibility of the query optimizer and an external tool being out of synchronization. The user is also guaranteed that the query optimizer will use the recommended materialization if it is defined, since the query optimizer recommended the materialization by using its equations to pick that materi

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method for determining optimal database materializations... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method for determining optimal database materializations..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for determining optimal database materializations... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2846765

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.