Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2002-10-30
2004-10-05
Mizrahi, Diane D. (Department: 2175)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000
Reexamination Certificate
active
06801905
ABSTRACT:
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to data processing environments and, more particularly, to system and methods for eager and opportunistic property enforcement in a database system.
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level.
DBMS systems have long since moved from a centralized mainframe environment to a de-centralized or distributed environment. Today, one generally finds database systems implemented as one or more PC “client” systems, for instance, connected via a network to one or more server-based database systems (SQL database server). Commercial examples of these “client/server” systems include Powersoft® clients connected to one or more Sybase® Adaptive Server® Enterprise database servers. Both Powersoft® and Sybase® Adaptive Server® Enterprise (formerly Sybase® SQL Server®) are available from Sybase, Inc. of Dublin, Calif. The general construction and operation of database management systems, including “client/server” relational database systems, is well known in the art. See e.g., Date, C., “An Introduction to Database Systems, Volume I and II,” Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
One purpose of a database system is to answer decision support queries. A query may be defined as a logical expression over the data and the data relationships set forth in the database, and results in identification of a subset of the database. Consider, for instance, the execution of a request for information from a relational DBMS. In operation, this request is typically issued by a client system as one or more Structured Query Language or “SQL” queries for retrieving particular data (e.g., a list of all employees earning $10,000) from database tables on a server. In response to this request, the database system typically returns the names of those employees earning $10,000, where “employees” is a table defined to include information about employees of a particular organization. The syntax of SQL is well documented, see e.g., “Information Technology—Database languages—SQL,” published by the American National Standards Institute as American National Standard ANSI/ISO/IEC 9075: 1992, the disclosure of which is hereby incorporated by reference.
SQL queries express what results are requested but do not state how the results should be obtained. In other words, the query itself does not tell how the query should be evaluated by the database management system. Rather, a component called the optimizer determines the “plan” or the best method of accessing the data to implement the SQL query. The query optimizer is responsible for transforming an SQL request into an access plan composed of specific implementations of the algebraic operator selection, projection, join, and so forth. Typically, this is done by generating many different access strategies, evaluating the cost of each, and selecting the access plan with the lowest overall cost, where “cost” is a metric that measures a combination of factors, including, but not limited to, the estimated amount of computational overhead, the number of physical Input/Output (“I/O”) operations, and the response time. However, producing an optimal access plan for any given SQL query is a complex problem.
In a modern optimizer, efficient handling of orderings is of great importance to query optimization. Indeed, ordering-based techniques remain unavoidable in modern query processing, despite the effectiveness of hash-based techniques, see e.g., G. Graefe, “The Value of Merge-Join and Hash-Join in SQL Server,” VLDB, 1999 and G. Graefe, R. Bunker, S. Cooper, “Hash Joins and Hash Teams in Microsoft SQL Server,” VLDB, 1998. Ordering is intimately related to two fundamental components of relational database management systems: B-tree indices, at the implementation level, and the ORDER BY clause, at the language level. A B-tree index, besides providing direct access to database records or tuples, also provides an ordering. An ORDER BY clause explicitly requires an ordering. The naive solution, lazy sorting, is in many cases sub-optimal.
Given that ordering is important, there are many relational implementation techniques that rely on ordering and would benefit from its advanced handling. One example of a relational technique that would benefit from improved handling of orderings is merge join, the join algorithm that relies on its arguments being ordered on the equi-join clause columns. Several other relational database algorithms rely on ordered input including: merge union distinct, group sorted, distinct ordered and the min/max scalar aggregates.
An ordering is usually either provided by an index or enforced using a sorting operation or “sort node.” An index being an alternative to the sort, the term “ordered input” is used instead of “sorted input” in this document. As the cost of a sorting operation is related to the amount of data to sort and as many relational algorithms preserve in their result some of the orderings provided by their arguments, sorting should be performed in the most efficient place that makes the ordering available up to the point where it is actually needed. In other words, a solution is required which provides for optimal placement of the sort node in an access plan. Because sorting is an expensive operation, optimizers typically preserve sub-plans that provide an “interesting ordering” even if there are other cheaper (in terms of query execution cost) sub-plans that do not provide an interesting ordering. “Interesting ordering” involves consideration of the ordering of intermediate results from operator to operator. In particular, retaining a sub-plan that has an interesting ordering may enable a sort operation to be avoided, thereby reducing overall query execution cost compared to another sub-plan that does not have an interesting ordering. However, preserving these sub-plans reduces pruning and results in more sub-plans staying in the competition to create the best total plan. This results in an increase in the search space.
Among the ordering-based algorithms, distinct and group ordered are very inexpensive (in terms of query execution cost) implementations (non-blocking, on-the-fly, relying on their input having an ordering on the grouping columns) of an otherwise expensive operator. The distinct logical relational operator, also called delta-project, removes the duplicate tuples from its input; it gr
Mizrahi Diane D.
Riddle G. Mack
Smart John A.
Sybase Inc.
Wu Yicun
LandOfFree
Database system providing methodology for property enforcement does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Database system providing methodology for property enforcement, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Database system providing methodology for property enforcement will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3278478