Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-12-03
2003-12-16
Mizrahi, Diane D. (Department: 2175)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
Reexamination Certificate
active
06665664
ABSTRACT:
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to information processing environments and, more particularly, to computer-implemented methodologies for query optimization in a data processing system, such as a Database Management System (DBMS).
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of a database management system is known in the art. See e.g., Date, C., “An Introduction to Database Systems, Volume I and II,” Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
DBMS systems have long since moved from a centralized mainframe environment to a de-centralized or distributed environment. One or more PC “client” systems, for instance, may be connected via a network to one or more server-based database systems (SQL database server). Commercial examples of these “client/server” systems include Powersoft® clients connected to one or more Sybase® SQL Anywhere® Studio (Adaptive Server® Anywhere) database servers. Both Powersoft® and Sybase® SQL Anywhere® Studio (Adaptive Server® Anywhere) are available from Sybase, Inc. of Emeryville, Calif.
In today's computing environment, database technology can be found on virtually any device, from traditional mainframe computers to cellular phones. Sophisticated applications, whether human resources information systems or sales force automation systems, can “push” much of their complexity into the database itself. Indeed, this represents one of the main benefits of database technology. The challenge, however, is to support these applications, and the complex queries they generate, on small computing devices. At the same time, users expect the productivity and reliability advantages of using a relational DBMS.
Consider, for instance, the execution of a request for information from a relational DBMS. In operation, this request is typically issued by a client system as one or more Structured Query Language or “SQL” queries for retrieving particular data (i.e. data records meeting the query condition) from database tables on a server. For example, a simple SQL SELECT statement.
SELECT name
FROM employees
WHERE sal=10,000
results in a list of the names of those employees earning $10,000, where “employees” is a table defined to include information about employees of a particular organization. The syntax of SQL is well documented, see e.g., the abovementioned “An Introduction to Database Systems.” SQL has become the standard for relational database access, see, e.g. Melton, J. (ed.), “American National Standard ANSI/ISO/IEC 9075-2: 1999, Information Systems—Database Language—SQL Part2: Foundation,” the disclosure of which is hereby incorporated by reference.
SQL queries express what results are requested but do not state how the results should be obtained. In other words, the query itself does not tell how the query should be evaluated by the database management system. Rather, a component called the optimizer determines the “plan” or the best method of accessing the data to implement the SQL query. The query optimizer is responsible for transforming an SQL request into an access plan composed of specific implementations of the algebraic operator selection, projection, join, and so forth. Typically, this is done by generating many different join strategies, evaluating the cost of each, and selecting the access plan with the lowest overall costs, where “cost” is a metric that measures a combination of factors, including, but not limited to, the estimated amount of computational overhead, the number of physical Input/Output (“I/O”) operations, and the response time. However, producing an optimal access plan for any given SQL query is a complex problem.
Query optimization in a relational DBMS is largely based on the existence of conjunctive conditions in each query's Where clause. Conjunctive conditions are useful because they must each evaluate to true in order for the query's Where clause to be satisfied. Hence, such conjunctive conditions (or “prime implicates”) can be naturally exploited by the query optimizer. However, traditionally the discovery of such prime implicates requires the database engine to first transform or “normalize” the original search condition into conjunctive normal form and remove redundant terms.
The problem of simplifying arbitrary Boolean expressions is a well-known problem in the history of computer science. Originally, the motivation for the study of this problem was its role in the minimization of Boolean functions and switching circuits (see, e.g., W. V. Quine, “The problem of simplifying truth functions,” American Mathematics Monthly, 59:521-531, 1952; W. V. Quine, “A way to simplify truth functions,” American Mathematics Monthly, 62:627-631, 1955; and James R. Slagle, Chin-Liang Chang, and Richard C. T. Lee, “A new algorithm for generating prime implicants,” IEEE Transactions on Computers, 19(4):304-310, 1970). Boolean state reduction has directly influenced power and material requirements by eliminating redundant gates and easing three-level NAND implementation. See, e.g., Olivier Coudert, “On solving covering problems,” Proceedings of the 33rd Annual Conference on Design Automation, pages 197-202, Las Vegas, Nevada, June 1996. However, the difficulties of normalization and redundant term elimination in SQL query optimization have been mostly overlooked.
The goal of query optimization in a relational DMBS is not simply reducing the number of conditions to be examined. Although minimization of the number of conditions reduces the number of operations required during query processing, what is equally important to a relational DBMS engine is the discovery of prime implicates that can be exploited to deduce a cheaper data access strategy. These prime implicates may be redundant or subsumed by the original expression and therefore may be ignored or even pruned by a logic minimizer.
Theoretically, the most efficient SQL statement—in terms of the time required to execute it—could be determined by generating and costing all possible semantically-equivalent forms. However, in a real-world setting a database engine must generate useful prime implicates along with other semantic transformations without spending more time than is required to execute the unoptimized query. See, e.g., Shashi Shekhar, Jaideep Srivastava, and Soumitra Dutta, “A formal model of trade-off bet
Paulley Glenn Norman
Vorwerk Kristofer Paul
Mizrahi Diane D.
Smart John A.
Sybase Inc.
Wu Yicun
LandOfFree
Prime implicates and query optimization in relational databases does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Prime implicates and query optimization in relational databases, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Prime implicates and query optimization in relational databases will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3133808