Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-06-19
2001-01-30
Black, Thomas G. (Department: 2771)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000
Reexamination Certificate
active
06182079
ABSTRACT:
INTRODUCTION
The present invention relates to specifying indexes for relational databases. The present invention also relates to a relational database including processes for specifying indexes.
Data processing environments are known in which executable instructions are arranged to produce a data set derived from data contained within a database in response to a data enquiry. Data may be accessed directly from data tables, where it may be necessary to search all entries within the table in order to obtain the information required. Alternatively, searching procedures may also make use of indexes in order to substantially increase the speed of a searching process.
Designing index structures for large and heavily used databases is presently an extremely difficult exercise and highly susceptible to the introduction of errors. This problem exists because the technical demands and constraints are such that it is not possible for a human database administrator to simultaneously perceive the indexing requirements for, typically, hundreds or thousands of different structured query language (SQL) statements, that run against the database on a day-to-day basis, and then to convert these requirements into a preferred set of indexes defined over the whole database. However, poorly specified index designs will result in SQL statements that consume far too much of the processor facility, that run for far longer than they should and result in a machine that is heavily overloaded.
For a long time, there has been a requirement for procedures that globally specify index structures defined over a given database design, for a typical SQL workload, which may be referred to as a target workload. However, this technical problem has persisted given the inherent difficulties of realising a technical solution, implemented and taking advantage of the processing capability available, without requiring intuitive mental processes on the part of human operators.
SUMMARY OF THE INVENTION
According to a first aspect of the present invention, there is provided a method of specifying a set of indexes for a database, comprising the steps of: analysing a plurality of statements supplied to said database to identify indexable predicates; deriving indexes from the indexable predicates identified in said analysing step; evaluating levels of improved operation achievable with said indexes; processing said evaluated levels to specify a preferred set of indexes for said database.
In a preferred embodiment levels of improved operation are evaluated by creating a scaled-down model of database tables derived from information relating to the nature of said tables. Typically, the scaled-down model may include in the region of 5000 data entries per table. Preferably, the model database is populated with representative data entries taken from the live database being modelled and said model may be populated by considering the cardinality of an existing index of the live database. In addition, the database model may be populated by considering the distribution of entries within an existing index of the live database.
In a preferred embodiment, database statistics are copied from the live. database to the database model. Preferably, a base level cost is calculated for executing statements without additional indexes being present. Preferably, cost levels are obtained by estimating execution time. In addition, cost levels may be estimated by assessing index maintenance overheads.
According to a second aspect of the present invention, there is provided an apparatus for specifying an index set for a database comprising: analysing means for analysing a plurality of statements supplied to the database to identify indexable predicates; deriving means operable to derive indexes from the indexable predicates identified by said analysing means; evaluating means for evaluating levels of improved operation achievable with said indexes; and specifying means operable to process said evaluated levels and specify a preferred set of indexes for said database.
In a preferred embodiment possible indexes are identified from predicate sets defined by said statements.
According to a third aspect of the present invention, there is provided data processing apparatus arranged to specify an index set for a database, said apparatus comprising data storage means, data processing means and program instructions readable from said data storage means, wherein said processing means is configured, in response to said instructions, to provide means for: analysing a plurality of statements supplied to said database to identify indexable predicates; deriving indexes from the indexable predicates identified in said analysing step; evaluating levels of improved operation achievable with said indexes; and processing said evaluated levels to specify a set of preferred indexes for said database.
In a preferred embodiment, cost savings are calculated by processing cost values of the old SQL statement costs and of the new SQL statement costs with a possible index. Cost saving may be calculated by subtracting the new costs from the old costs. Preferably, cost savings are calculated for to tables by considering each new possible index in turn with reference to its respective table.
In a preferred embodiment, possible indexes are ordered in terms of potentiality for being specified as preferred indexes. Index combinations may be identified by randomly combining existing potential indexes and processing said evaluated levels to specify a set of preferred indexes.
According to a fourth aspect of the present invention, there is provided a database comprising a plurality of data tables stored in machine readable form, processing means for processing said data tables in response to statements and for generating indexes to facilitate the processing of said data tables, further comprising instructions executable by said processing means for specifying a preferred index set, wherein said instructions are configured to analyse a plurality of statements supplied to said database to identify indexable predicates, derive indexes from said indexable predicates, evaluate levels of improved operation achievable with said indexes and process said evaluated levels to specify a preferred index set for said database.
REFERENCES:
patent: 4805099 (1989-02-01), Huber
patent: 4956774 (1990-09-01), Shibamiya et al.
patent: 5404510 (1995-04-01), Smith et al.
patent: 0 351 388 (1990-01-01), None
Black Thomas G.
British Telecommunications public limited company
Nixon & Vanderhye P.C.
Trinh William
LandOfFree
Specifying indexes by evaluating costs savings for improving... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Specifying indexes by evaluating costs savings for improving..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Specifying indexes by evaluating costs savings for improving... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2476646