Apparatus and method for decomposing database queries for...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000

Reexamination Certificate

active

06289334

ABSTRACT:

REFERENCE TO APPENDICES
The disclosure of this patent document contains material which is subject to copyright protection. The owner thereof has no objection to facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office patent file or records, but otherwise reserves all copyright whatsoever.
BACKGROUND OF THE INVENTION
This invention relates to digital data processing and, more particularly, to methods and apparatus for database management systems on multiprocessor digital data processing systems.
In addition to performing calculations, computers have traditionally been used to store and retrieve large amounts of data. Early computer systems were typically programmed for this on an ad hoc basis. For example, to track a company's employees, a program was typically written to handle all steps necessary to input, sort and store employee data in a computer file and, as necessary, to retrieve and collate it to generate reports. Special-purpose software packages, referred to as database management systems (or “DBMS's”), were later developed to handle all but the highest-level of these tasks.
Among the most widely used database management systems are the so-called relational systems. From an operator's perspective, these store data in two-dimensional tables. For example, each row (or record) of an employee data table might include the following columns (or fields) of information: name of an employee, his or her identification number, address, and department number.
.
.
.
.
.
.
.
.
.
.
.
.
Smith
1056
5 Oak Avenue
10
James
1058
3 State Street
41
Wright
1059
15 Main Street
25
.
.
.
.
.
.
.
.
.
.
.
.
One or more indexes on large tables are generally provided to facilitate the most common data accesses, e.g., look-ups based on employee name.
In relational systems, corresponding rows in two or more tables are identified by matching data values in one or more columns. For example, the department name corresponding to a given employee may be identified by matching his or her department number to a row in a department data table that gives department numbers and department names. This is in contrast to hierarchical, network, and other DBMS's that use pointers instead of data values to indicate corresponding rows when tables are combined, or “joined.”
Relational DBMS's typically permit the operator to access information in the database via a query. This is a command that specifies which data fields (columns) are to be retrieved from a database table and which records (rows) those fields are to be selected from. For example, a query for the names of all employees in department 10 might be fashioned as follows:
SELECT name, department_number
FROM employee
WHERE department_number = 10
There is no particular ordering of the resulting rows retrieved by the DBMS, unless the query specifies an ordering (e.g., ORDER BY name).
A query may also involve multiple tables. For example, to retrieve department names instead of numbers, the above query might be refashioned as follows:
SELECT name, department_name
FROM employee, department
WHERE department_number = 10
AND employee.department_number=department.department_number
A particular relational data table need not be stored in a single computer file but, rather, can be partitioned among many files. This makes such tables particularly suited for use on multiprocessor computer systems, i.e., computer systems having multiple processors and multiple disk drives (or other storage devices) of the type disclosed in U.S. Pat. No. 5,055,999. Unfortunately, prior art DBMS's have not proven capable of taking full advantage of the power of such multiprocessing systems and, particularly, their power to simultaneously process data (in parallel) from multiple partitions on multiple storage devices with multiple central processing units.
In view of the foregoing, an object of the invention is to provide improved methods and apparatus for database management and, particularly, improved methods and apparatus for data base management capable of operating on multiprocessor systems.
A further object of the invention is to provide improved systems for database management capable of effectively accessing a relational database contained in multiple tables and multiple partitions.
A still further object is to provide improved methods and apparatus for storing and retrieving data for access by a DBMS.
These and other objects are evident in the attached drawings and the description which follows.
SUMMARY OF THE INVENTION
The foregoing and other objects are attained by the invention which provides, in one aspect, improvements to digital data processors of the type having a database management system (DBMS) that accesses data records stored in a database table contained among plural independently accessible partitions (e.g., data partitions contained on separate disk drives), where that DBMS has a standard interface for processing queries to access those data records.
The improvement is characterized by a parallel interface that intercepts selected queries prior to substantive processing by the standard interface. The standard interface is often called the “server” interface; it is accessed by clients that are the source of queries. A decomposition element within the parallel interface generates multiple subqueries from the intercepted query. Those subqueries, each representing a request for access to data stored in a respective partition of the table, are applied in parallel to the standard interface in lieu of the intercepted query. Responses by the DBMS to the subqueries are reassembled to generate a final response representing the response the DBMS would have generated to the intercepted query signal itself. Such reassembly can include interleaving the data contained in the responses (e.g., to create a single sorted list) or applying an aggregate function (e.g., sum or average) to that data.
According to a further aspect of the invention, the decomposition element generates the subqueries to be substantially identical to the intercepted signal but including an “intersecting predicate” (i.e., additional query conditions) that evaluates true for all data records in respective partitions of said database table and false for all others. This can be, for example, a logically AND'ed condition that evaluates true for records in the respective partition. Continuing the first example above, assuming that the employee database is partitioned randomly across multiple partitions, a subquery for the first partition could be generated as follows (where rowid has three parts, the last of which indicates the partition number):
SELECT name, department_number
FROM employee
WHERE department_number = 10 AND
employee.rowid>=0.01 AND
employee.rowid<0.0.2
In another aspect, the invention contemplates a further improvement to a digital data processing system of the type described above, wherein the DBMS responds to selected queries for accessing data recordsjoined from one or more of database tables, and wherein the DBMS includes an optimizer for determining an optimal strategy for applying such queries to the tables. The improvement of this aspect is characterized by an element for identifying, from output of the optimizer, a driving table whose partitions will be targeted by subqueries generated in responding to an intercepted query. The improvement is further characterized by generating the subqueries to include, in addition to the predicate list of the intercepted query, an intersecting predicate for all data records in respective partitions of the driving database table. Those skilled in the art will appreciate that tables referenced in the query other than the driving table need not be identically partitioned to the driving table, nor co-located with its partitions on storage devices. Tables may be accessed through either full-table scans or indexed scans, i.e., whether the DBMS searches all blocks of the relevant partition or only those indicate

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Apparatus and method for decomposing database queries for... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Apparatus and method for decomposing database queries for..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Apparatus and method for decomposing database queries for... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2477304

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.