System and methodology for join enumeration in a...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000

Reexamination Certificate

active

06516310

ABSTRACT:

COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
The present invention relates generally to information processing environments and, more particularly, to access and processing of information in a data processing system embodied, at least in part, in portable devices.
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level.
The general construction and operation of a database management system is known in the art. See e.g., Date, C.,
An Introduction to Database Systems,
Volume I and II, Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
DBMS systems have long since moved from a centralized mainframe environment to a de-centralized or distributed environment. One or more PC “client” systems, for instance, may be connected via a network to one or more server-based database systems (SQL database server), to form a client/server database system. In operation, clients issue one or more query language (e.g., SQL) commands to the server. A query language is a specialized language for accessing or processing information from a database. SQL commands may, for instance, specify a query for retrieving particular data (i.e., data records meeting the query condition) from a database table. The syntax of SQL (Structured Query Language) is well documented; see, e.g., the abovementioned
An Introduction to Database Systems.
As used herein, “SQL” shall also include vendor-specific variants of SQL, such as Sybase® Transact-SQL. In addition to retrieving the data from database server tables, the clients also include the ability to insert new rows of data records into the table; clients can also modify and/or delete existing records in the table. Commercial examples of these “client/server” systems include Powersoft™ clients connected to one or more Sybase Adaptive Server™ database servers. Both Powersoft™ and Sybase Adaptive Server™ (formerly Sybase SQL Server™) are available from Sybase, Inc. of Emeryville, Calif.
In today's computing environment, database technology can be found on virtually any device, from traditional mainframe computers to cellular phones. Sophisticated applications, whether enterprise information portals or sales force automation systems, can “push” much of their complexity into the database itself. Indeed, this represents one of the main benefits of database technology. The challenge, however, is to support these complex applications, and the queries they generate, on small computing devices. At the same time, users expect the productivity and reliability advantages of using an SQL database, while maintaining the size and performance advantages of hand-coded applications.
Consider, for instance, the execution of an SQL request or query. A query “optimizer” in a relational DBMS is responsible for transforming an SQL request into an access plan composed of specific implementations of the algebraic operators selection, projection, join, and so on. Typically, this is done by generating many different join strategies, evaluating the cost of each, and selecting the access plan with the lowest overall cost, where “cost” is a metric that measures a combination of factors, including but not limited to the estimated amount of computational overhead, number of physical I/O operations, and response time. The process of generating these alternative join strategies is termed “join enumeration.” However, producing an optimal access plan for an arbitrary SQL query is an NP-complete problem (see, e.g., Ibaraki, T. and Kameda, T., “On the optimal nesting order for computing n-relational joins”, ACM Transactions on Database Systems, 9(3): 482-502, September 1984; Ono, K. and Lohman, G. M., “Measuring the complexity of join enumeration in query optimization”, Proceedings of the 16th International Conference on Very Large Data Bases, pp. 314-325, Brisbane, Australia, August 1990, Morgan Kaufmann; Ozsu, M. T. and Valdariez, P., “Principles of Distributed Database Systems”, Prentice-Hall, Englewood Cliffs, New Jersey, 1991; Steinbrunn, M., et. al., “Heuristic and randomized optimization for the join ordering problem”, The VLDB Journal, 6(3): 191-208, August 1997), to discover an optimal strategy requires an exhaustive search. Consequently, optimizers often use heuristics (see, e.g., Ono, K. and Lohman, G. M. above; Steinbrunn, M., et. al. above; Ullman, J. D., “Principles of Database and Knowledge-Base Systems, Volume 2”, Computer Science Press, Rockville, Md., 1989) to reduce the number of strategies that the plan selection phase must consider.
A common heuristic used in most commercial optimizers is to restrict the strategy space to those that perform unary operations (particularly restriction) first, thus reducing the size of intermediate results. See, e.g., Smith, J. M. and Chang, P. Y.-T., “Optimizing the performance of a relational algebra database interface”, Communications of the ACM, 18(10): 568-579, October 1975; Ullman, J. D. above. Another common optimization heuristic, and one used by IBM's STARBURST, is to defer the evaluation of any Cartesian products to as late in the strategy as possible. See, e.g., Morishita, S., “Avoiding Cartesian products for multiple joins”, Journal of the ACM, 44(1): 57-85, January 1997; Ono, K. and Lohman, G. M. above. To further reduce the number of alternative plans, an optimizer may consider only left-deep processing trees. See, e.g., Cluet, S. and Moerkotte, G., “On the complexity of generating optimal left-deep processing trees with cross products”, Proceedings of the Fifth International Conference on Database Theory—ICDT 1995, pp. 54-67, Prague, Czech Republic, January 1995, Springer-Verlag; Ibaraki, T. and Kameda, T. above; Selinger, P. G., et. al., “Access path selection in a relational database management system”, ACM SIGMOD International Conference on Management of Data, pp. 23-34, Boston, Mass., May 1979. For SPJ queries a left-deep processing tree is one where the right child of any join must be a base table. For more complex queries, a left-deep tree means that the right child of any binary operator cannot be a join, though it could be the (possibly materialized) result of a view or table expression containing Union, Group by, or aggregation. Left-deep trees are desirable because (1) they reduce the need to materialize intermediate results, (2) for several types of join implementations they result in more efficient execution plans, and (3) the space of “bushy” plans is considerably larger, and hence more expensive to search. See, e.g., Vance,

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

System and methodology for join enumeration in a... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with System and methodology for join enumeration in a..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and methodology for join enumeration in a... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3146222

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.