Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-06-16
2002-04-16
Black, Thomas (Department: 2171)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06374257
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to database systems, and more specifically, to a mechanism for removing ambiguities in a shared database command.
BACKGROUND OF THE INVENTION
In typical database systems, users store, update and retrieve information by submitting commands to a database server. To be correctly processed, the commands must comply with the database language that is supported by the database server. One popular database language is known as Structured Query Language (SQL).
The response time of a finely-tuned database can be adversely affected by only a tiny proportion of the workload. A challenge that is encountered by most database administrators (“DBA”) is how to constantly ensure that the performance of a database is not degraded due to execution of resource-intensive database commands.
In many cases, a database command may be highly resource-intensive due to certain inefficiencies that were introduced by the application that generated the database command. Database commands that fall into this category are referred to herein as “poorly performing database commands”. For example, if the application SQL is written in a way that prevents the optimizer from using an available index, in executing the database command, the database server will typically be required to perform a full scan of the table in order to determine the correct rows. By having to execute a full scan of the table, the database server is required to process a greater number of rows than would normally be required if the rows were selected using a proper index. Thus, by building the index on the table, or by modifying the application to generate an alternate database command that reduces the number of rows that need to be scanned, the resources required by the database command can be reduced. In addition, because a database command may be repeatedly executed, often by multiple database users, the tuning of the database command can significantly increase the performance of a database system.
For database systems that support the SQL database language, database commands typically take the form of SQL statements. In general, the initial step in tuning the workload of a database system that supports SQL consists of identifying the SQL statements that are highly resource-intensive, and thus inflicting the most damage on the performance of the database system.
One method of capturing the metrics of poorly performing database commands is by tracing the execution of SQL statements as they are executing in the database system. In particular, by executing a tracing mechanism, the relevant details of a statement's execution can be captured. The details of the statement's execution can then be examined to identify any inefficiencies that were introduced by the application program in generating the SQL statement.
However, a drawback with tracing the execution of an SQL statement is that the tracing mechanism itself requires a vast amount of resources to execute and thus introduces a significant amount of overhead into the database system. In many cases, the cost of turning on the tracing mechanism is so prohibitive as to dissuade the DBA from arbitrarily tracing the SQL statements executing in the database system. Thus, the tracing mechanism is typically only enabled and started after the performance of the database system has already significantly declined.
Another method of identifying highly resource-intensive SQL statements involves reviewing the metric information that is associated with the different SQL statements that are executing in the database system. The metrics for a particular SQL statement can provide such details as the execution time, the I/O count, the number of sorts, and the rows processed while executing the particular SQL statement. By examining the metric information that is associated with SQL statements executing in the database system, the DBA can identify those SQL statements that impact the database most heavily.
For example, in certain database systems, such as the Oracle™ database system, in executing SQL statements, the database system maintains the SQL statements and their aggregated metrics in a shared cache. The metrics may include, for example, the number of I/O operations performed during execution of the query and the number of times the query has been executed. Thus, by inspecting the metric information, a DBA can determine the per-execution metrics for that particular SQL statement, and thus identify poorly performing SQL statements.
After a poorly performing query has been identified, the database may be “tuned” to allow the query to perform more efficiently. The process of tuning the database for executing the queries may involve, for example, creating new indexes that can be used to process the queries. To know how to tune a database for a query, the schema against which the query is executed must be known. For example, if a poorly performing query is executed against an “emp” table in one schema, performance of the query would not be benefited by building an index on an “emp” table in a different schema.
Unfortunately, it is not always clear what schema context was used to execute a query. For example, to limit the amount of resources that are required to maintain the shared cache area, the database system maintains a single entry for each distinct SQL statement, regardless of how many different schemas execute the SQL statement. SQL statements for which only a single cache entry is maintained, but which have been issued from more than one schema, are referred to as “shared SQL statements”.
Thus, it is possible for two users to issue separate SQL statements from different schemas but have both SQL statements map to the same shared cache entry, thus causing the aggregated metrics in the shared cache to reflect the sum of the statistics across both users for the given statement. For example, if user “A” using a schema “X” context and user “B” using a schema “Y” context, each execute the SQL statement:
SELECT “name”, “salary”, “dept_num”
FROM emp, dept
WHERE emp.name=dept.emp_name
AND emp.salary>50000.
both statements will map to the same cache entry even though the two statements were executed in different schema contexts.
In this example, the table objects “emp” and “dept” used in the query are ambiguous. An ambiguous object is an object that is identified using a name that applies equally to objects that belong to different schemas. In this example, schema “X” and schema “Y” both have tables named “emp” and “dept”, thus rendering the object names “emp” and “dept” ambiguous in the SQL statement.
One common way to refer to objects unambiguously is to qualify the ambiguous object name with the name of the schema that contains the desired object. For example, the “emp” table in schema X would be referred to as X.emp, while the “emp” table in schema Y would be referred to as Y.emp. Because the SQL statement given above does not qualify the ambiguous names by a particular schema, there is no way of determining whether the SQL statement was executed by user “A” in the context of schema “X” or by user “B” in the context of schema “Y” by looking at the cache entry for the SQL statement.
Thus, the problem with having ambiguous objects in an SQL statement is that the particular schema that was used to execute the SQL statement can not be identified. Without knowing the particular schema that was used to execute the SQL statement, the DBA cannot execute the SQL statement within the same operating environment as the database user who originally ran the statement. This is important as the DBA needs to tune the statement from the context of the executing user.
For example, a “DEMO” schema and a “PRODUCTION” schema may both include a set of table objects “emp” and “dept”. As such, the SQL statement:
SELECT *
FROM emp, dept
may be executed using both the “DEMO” schema and the “PRODUCTION”. However, the “DEMO” schema may include only a small number of entries in each of its table objects “emp” and “dept”, while the “PRODUCTION” schema may inclu
Feng Yao
Guay Todd
Shah Vipul
Soule Jonathan
Black Thomas
Brandt Carl L.
Hickman Palermo & Truong & Becker LLP
Mizrahi Diane D.
Oracle Corporation
LandOfFree
Method and system for removing ambiguities in a shared... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and system for removing ambiguities in a shared..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and system for removing ambiguities in a shared... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2894837