Database system providing methodology for acceleration of...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06823329

ABSTRACT:

COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
COMPUTER PROGRAM LISTING APPENDIX
A Computer Program Listing Appendix, containing one (1) total file on compact disc, is included with this application.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to data processing environments and, more particularly, to system and methods for improved optimization and execution of queries involving functional expressions against database columns having enumerated storage.
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
One purpose of a database system is to answer decision support queries and support transactions. A query may be defined as a logical expression over the data and the data relationships set forth in the database, and results in identification of a subset of the database. For example, a typical query might be a request, in SQL syntax, for data values corresponding to all customers having account balances above required limits. During query processing, a database system typically utilizes one or more indexes to answer queries. Indexes are organized structures associated with the data to speed up access to particular data values (i.e., answer values). Indexes are usually stored in the database and are accessible to a database administrator as well as end users. The basic operation of database systems, including the syntax of SQL (Structured Query Language), is well documented in the technical, trade, and patent literature; see, e.g., Date, C.,
An Introduction to Database Systems
, Volume I and II, Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
“Data warehouse” systems represent a type of database system optimized for supporting management decision making by tracking and processing large amounts of aggregate database information—that is, the data warehouse. Data warehouses contain a wide variety of data that present a coherent picture of business conditions at a single point in time. Development of a data warehouse includes development of systems to extract data from operating systems plus installation of a warehouse database system that provides managers flexible access to the data. A well-known example of a data warehouse system is Sybase® Adaptive Server® IQ (ASIQ), available from Sybase, Inc. of Dublin, Calif.
A data warehousing system typically handles large pools of historical information representing different portions of a business. These different portions may, for example, represent a number of different geographical areas, such as information relating to sales to customers located in Texas, Georgia, Massachusetts, New York, and Maine. The information may also represent different periods of time, such as sales in particular calendar months or quarters. A number of different products and/or business units may also be involved. Database systems frequently store very large quantities of information and for this reason database administrators continuously look for ways to more efficiently store and retrieve information in database systems.
One well-known technique for more efficiently storing information in database systems is utilizing enumerated storage. Enumerated storage utilizes a dictionary look-up style scheme in which the dictionary (or look-up table) contains a list of distinct values, with each distinct value associated with a particular offset which serves as an address for locating that value. Instead of redundantly storing the same information, such as the name of a particular state (e.g., New York) multiple times in a column of the database table, the look-up table contains values that will be used as the contents of the column. The column in the main database table may then store only the offset. However, since there is a relationship between the look-up table and the main table, the value associated with offset in the look-up table is also associated with the record in the main table. For example, a database column T.STATE_NAME with enumerated storage may have a look-up table such as the following:
offset
1
|
Texas
|
2
|
Georgia
|
3
|
Massachusetts
|
4
|
New York
|
5
|
Maine
|
In a situation where the number of distinct values in a particular column of a database is modest (e.g., 50 states) compared to the number of rows in the table, this enumerated storage scheme is considerably more space efficient than repeatedly storing the same raw data values in the database column, and thereby requires fewer input/output (I/O) operations to read or write.
One problem in handling many database queries is the evaluation of functional expression on columns. Considerable effort is spent in database query engines to identify the point in a query plan where a functional expression should be evaluated in order to minimize the estimated number of evaluations of the expression. If a predicate (or condition) includes complex or simple expressions over column(s) that cannot be answered from an index, then formulating an answer to the predicate involves the expensive (in terms of system performance) task of scanning all of the raw data values of the column, row by row, in order to respond to the query. For instance, a column may contain state names. A user could use an expression such as the following to obtain all state names beginning with the letter “S”:
SELECT T.CUSTOMER FROM T
WHERE T.STATE_NAME LIKE ‘S%’
Typically, a traditional B-Tree index would enable a user or administrator to easily select all states beginning with a particular letter, because the first letter is the leading portion of the string. However, if a user wanted to find the second letter of a state name, this could not be handled using the index as this string is in the middle of the data. As a result there is no traversal path in the index that will lead to the answer. Thus, a table scan (or column scan) would typically be required to handle this type of query expression in current database systems.
Consider, for example, the following Structured Query Language (SQL) statement:
SELECT T.CUSTOMER FROM T
WHERE SUBSTRING(T.STATE_NAME, 2, 1)=‘e’
This type of expression could not be resolved from a typical B-Tree index. It also is problematic to resolve against a column with enumerated storage in current database systems, as it requires a column scan of the raw data in order to evaluate the query. If one has a large column with billions of rows, this is an expensive task to undertake, as it could easily require the evaluation of the expression billions of times.
If a database administrator knows in advance that a particular query using a particular functional expression will be used, one possible way to address this problem is by using the concept of “materialized views.” This approach involves executing a particular query in advance and storing the results in an index or data structure. Materialized views may be useful when an administrator knows in advance that a particular query expression containing specific functional expressions will be used. When a query optimizer subsequently receives that particular query (e.g., a specific query expression containing the exact same set of substitution arguments

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Database system providing methodology for acceleration of... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Database system providing methodology for acceleration of..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Database system providing methodology for acceleration of... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3334217

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.