Data processing: database and file management or data structures – Database design – Data structure types
Utility Patent
1998-05-30
2001-01-02
Breene, John (Department: 2777)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000
Utility Patent
active
06169983
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to the field of database systems. More particularly, the present invention relates to the field of indexing data for database systems.
BACKGROUND OF THE INVENTION
Computer database systems manage the storage and retrieval of data in a database. A database comprises a set of tables of data along with information about relations between the tables. Tables represent relations over the data. Each table comprises a set of records of data stored in one or more data fields. The records of a table are also referred to as rows, and the data fields of records in a table are also referred to as columns.
A database server processes data manipulation statements or queries, for example, to retrieve, insert, delete, and update data in a database. Queries are defined by a query language supported by the database system. To enhance performance in processing queries, database servers use indexes to help access data in a database more efficiently.
The number of possible indexes over a database, however, can be very large and significantly increase the storage requirements for the database. Also, many of the possible indexes may provide no or minimal performance advantage considering the data in the database, the organization of the data in the database, and the usage of the database as represented by a workload of queries executed against the database. A physical database design tool or database administrator therefore strives to form an index configuration or set of indexes that is not only effective in enhancing performance in processing queries but also consumes less storage space.
SUMMARY OF THE INVENTION
An index merge tool helps form, for use by a database server in accessing a database in accordance with a given workload of queries, an index configuration or set of indexes that consumes relatively less storage space. The index merge tool attempts to merge indexes of an initial set of indexes to help minimize the amount of storage space consumed by the indexes while minimizing any increase in the cost of executing queries of the workload against the database using the indexes. The index merge tool identifies from the initial set of indexes one or more combinations of two or more indexes on the same table of the database and merges each identified combination of indexes. The index merge tool may identify and merge each combination of indexes by identifying and merging one pair of indexes at a time.
In accordance with the present invention, a method merges indexes for use in executing queries against a database. The method may be implemented in the form of program modules or computer-executable instructions stored on a computer readable medium.
For the method, at least two indexes are identified from an initial set of indexes and merged to form a merged index set consuming less storage space than for the initial set of indexes. One or more combinations of two or more indexes on the same table of the database may be identified from the initial set of indexes, and each identified combination of indexes may be merged to form the merged index set. Of the initial set of indexes, the method may possibly merge only indexes of a subset of the initial index set that have a width of at least a predetermined number of column(s) and/or that are for index-only access.
Each combination of indexes may be identified and merged one pair of indexes at a time. Each pair of indexes may be merged such that all columns of one index of the pair are ordered ahead of any different columns of the other index of the pair. The order of columns may be determined based on how the indexes of each pair rank against one another. The method may possibly merge only pairs of indexes that will form a merged index having a width less than a predetermined percentage greater than that for each index of the pair used to form the merged index. The method may also possibly merge only pairs of indexes where the initial set of indexes comprises a single-column index on a leading column of both indexes of the pair and the leading column of each index of the pair are the same or where the initial set of indexes comprises a single-column index on only one or none of the leading columns of the indexes of the pair.
The merged index set is used to execute queries against the database if the storage space consumed by the merged index set is at least a predetermined percentage less than that for the initial set of indexes and/or if an estimated cost to execute queries of a workload for the merged index set is less than a predetermined percentage greater than an estimated cost to execute the queries of the workload for the initial set of indexes.
For another method, a subset of indexes on one table of a database is identified from an initial set of indexes. An initial subset of indexes that have a width of at least a predetermined number of column(s) and/or that are for index-only access may be identified from the initial set of indexes. The subset of indexes on one table of the database may be identified from the initial subset of indexes.
Each of one or more combinations of indexes from the identified subset of indexes are merged to form a set of one or more respective merged indexes. Each combination of indexes may be a pair of indexes.
Each combination of indexes may be merged such that all columns of one index of the combination are ordered ahead of any different columns of the other index(es) of the combination to form the respective merged index. The order of columns for each respective merged index may be determined based on how the indexes of the combination used to form the respective merged index rank against one another. Indexes of each combination may be ranked based on how a leading column of each index of the combination appears in one or more queries. Each index of each combination may be ranked with a highest rank if one or more queries have a condition on the leading column of the index, with a second highest rank if one or more queries have an order by clause on the leading column of the index, with a third highest rank if one or more queries have a group by clause on the leading column of the index, and with a fourth highest rank if one or more queries use the leading column of the index as a projection column.
The method may possibly merge only combinations of indexes that will form a merged index having a width less than a predetermined percentage greater than that for each index of the combination used to form the merged index. The method may also possibly merge only combinations of indexes where the initial set of indexes comprises a single-column index on a leading column of at least two indexes of the combination and the leading column of those at least two indexes of the combination are the same or where the initial set of indexes comprises a single-column index on only one or none of the leading columns of the indexes of the combination.
A merged index is selected from the set of one or more merged indexes. The merged index that saves the most storage space, for example, may be selected. The combination of indexes used to form the selected merged index is replaced with the selected merged index.
The merging, selection, and replacement of indexes is repeated one or more times for the subset of indexes on the current table. The method is similarly performed for each of one or more other tables of the database on which the initial set of indexes are indexed to form a merged index set.
The merged index set is used to execute queries against the database if the storage space consumed by the merged index set is at least a predetermined percentage less than that for the initial set of indexes and/or if an estimated cost to execute queries of a workload for the merged index set is less than a predetermined percentage greater than an estimated cost to execute the queries of the workload for the initial set of indexes.
Another method merges a pair of indexes to form a merged index for use in executing queries against a database.
For this method, one of the indexes
Chaudhuri Surajit
Narasayya Vivek
Breene John
Lewis Cheryl
Microsoft Corporation
Watts, Hoffmann, Fisher & Heinke Co. L.P.A.
LandOfFree
Index merging for database systems does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Index merging for database systems, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Index merging for database systems will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2500233