Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2000-03-20
2002-09-24
Mizrahi, Diane D. (Department: 2175)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
Reexamination Certificate
active
06457020
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to computer-implemented database systems, and, in particular, to query optimization using a multi-layered object cache.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into tables which consist of rows and columns of data. The rows are formally called tuples or records. A database will typically have many tables and each table will typically have multiple tuples and multiple columns. The tables are typically stored on direct access storage devices (DASD), such as magnetic or optical disk drives for semi-permanent storage.
The integration of object technology and database systems has been an active area of research for the past decade. One important aspect of the integration of these two technologies is the provision of efficient, declarative query interfaces for accessing and manipulating object data. Compared to other aspects of object-oriented database (“OODB”) technology, such as integrating persistence into object-oriented languages (e.g., C++ and Smalltalk), queries were given relatively little attention in the early days of OODB research.
A number of proposals for OODB query languages have appeared in the database literature. Query rewrite transformations have been developed for relational DBMSs. Many of these transformations also apply for Object Query Systems. However, new query rewrite transformations that apply specifically to Object Query Systems need to be developed. Predicate pushdown, which is a query rewrite transformation, is the notion of taking a query and determining which parts of the query can be migrated through the layers of the schema to the databases where the data resides. The objective is to use the power of the database query function to do data filtering, and, thereby, restrict the amounts of data that have to be transferred from the database servers to clients.
Predicate pushdown can include all of the predicates that define a query's result, in which case the task of restricting the result set is entirely performed by the databases where the data resides. Predicate pushdown can include partial predicates that define a query's results, in which case some of the predicates (e.g., a subset of the conjuncts that define a query's result) are passed down to the databases where the data resides, thereby restricting the results returned by these databases. The remaining predicates that could not be pushed down are then applied in object space by the query evaluator. Finally, if redicate pushdown cannot be applied, the predicates that define a query's results must be applied in object space after having retrieved the complete sets of data referenced in the query.
Query evaluation using a client cache is presented in Shaul Dar, Michael J. Franklin, Björn T. Jónsson, Divesh Srivastava, and Michael Tan, Semantic Data Caching and Replacement, Proc. 22nd International Conference on Very Large Data Bases, Mumbai, August 1996, [hereinafter “Dar et al.”]. Dar et al. focuses on determining whether a query can be resolved from the client cache alone or whether a partial query result can be obtained from the client cache with the remaining result drawn from the database server. The technique that is used for query evaluation is predicated on maintaining a semantic description of the client cache content. For a given table, the semantic description is a constraint that is dynamically modified to include new cache entries. For example, if a query initially retrieves all employees having a salary between 50,000 and 100,000, the constraint describing the cache content for the employee table is sal≧50000 and sal≦100000. If a subsequent query requests employees having a salary between 60,000 and 80,000, that query result can be drawn from the cache alone. A similar approach called predicate-based caching is presented in Arthur M. Keller and Julie Basu, A Predicate-Based Caching Scheme for Client-Server Database Architectures, The VLDB Journal, 5:35-47, 1996.
The ORION system is described in the paper entitled Architecture of the ORION Next-Generation Database System, by W. Kim, J. Garza, N. Ballou and D. Woelk, published in IEEE Transactions on Knowledge and Data Engineering, 2(1), in March 1990. In ORION, deferred updates result from altering the schema of a class, as, for example, when an attribute of a class is dropped from its schema. The task of updating all records of the class to remove the attribute from each instance is deferred until some later time. This allows schema updates to proceed quickly. ORION is an OODBMS which implements a client-server organization with support for queries that return handles on objects of a single class as query results. However, due to this restriction, ORION queries are limited to semi-joins, which are expressed as path queries (referred to as “nested class queries”). As an optimization, ORION queries can run both on the server and on the client computer. In that system updates to client resident objects which have not been propagated to the server can be reflected in query results using either a single buffer or a dual buffer evaluation scheme. Using the single buffer scheme, updates in the client cache are flushed to the server and the query is evaluated against the server.
However, this scheme is incompatible with optimistic locking, since updates can only be flushed at commit time. The dual buffer evaluation scheme runs a query against “dirty” objects in the client cache and against objects in the server. The two result sets are merged; result objects in the server's result set, which are also present in the client's result set, do not participate in the final result. This scheme is incompatible with optimistic locking. In this scheme, for example, there could have been an object O read into the cache by an earlier query within a single ongoing transaction. The object has since been updated in the database by a separate unrelated transaction. While the cache copy is “clean” and qualifies the query's result, the database copy no longer qualifies the query's result and is not returned by the server query. Since O is not “dirty”, it is not selected by the client compensating query.
In the ORION system, the dual buffer evaluation scheme is also presented for path queries. The scheme involves pushing partial updates back to the server, which is incompatible with optimistic locking. Furthermore, this technique has problems with respect to duplicate semantics, if extended to full join queries. While there are buffering differences, objects are homogeneous in both the client and the server. Moreover, this system is not concerned with creating an object-oriented representation of data from heterogeneous sources, and does not support extended SQL-92 queries which can return values, as well as objects, and, aside from paths, include operations which are joins, grouping and aggregation, and union.
The Garlic system is described in the paper entitled Loading a Cache With Query Results, by Laura M. Haas, Donald Kossman and Ioana Ursu, published in Proc. 25
th
International Conference on Very Large Data Bases, pages 351-352, Edinburgh, in September 1999. The Garlic database middleware system supports object queries that return handles to objects as query results and allows the application of methods in queries over collections of objects. The Garlic system can return handles on objects as query results without retrieving the data associated with these objects. Retrieval is performed by a separate key-based query, issued by the application when attributes of the object are accessed. Therefore, the Garlic system allows building objects strictly by retrieving data for the key attributes and, alternatively, retrieving data for all of the object's attri
Carey Michael James
Kiernan Gerald George
Mizrahi Diane D.
Mofiz Apu M
Parker Esq. Sandra M.
LandOfFree
Query optimization using a multi-layered object cache does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Query optimization using a multi-layered object cache, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Query optimization using a multi-layered object cache will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2879249