System and method for segmented evaluation of database queries

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000

Reexamination Certificate

active

06748392

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to the field of database systems. More particularly, the invention relates to a system and method for evaluating certain types of database queries on a per-segment basis, and for identifying those queries that are candidates for per-segment evaluation.
BACKGROUND OF THE INVENTION
Database systems store, retrieve, and process information. In order to retrieve information from the database, a user provides a query (written in a query language such as SQL), where the query specifies the information to be retrieved and the manner in which it is to be manipulated or evaluated in order to provide a useful result. To process the query, the database system may convert the query into a relational expression that describes algebraically the result specified by the query. The relational expression is then used to produce an execution plan, which describes particular steps to be taken by a computer in order to produce the sought result.
When a relational expression is produced from a query, it may be the case that certain terms in the expression are redundant. For example, the operands of a join operator may be two instances of the same table, T. If T is a table that is stored in the database, then a straightforward evaluation of the join requires redundant accesses to the same (possibly large) table T during execution. Worse yet, T may not be a stored table, but rather may a table that is computed from a complex relational sub-expression. In this case, straightforward evaluation of the expression may require T to be derived twice from the same (possibly complicated) expression at runtime.
A conventional way to address this problem is to identify and evaluate common sub-expressions, spool (i.e., “buffer”) the entirety of the sub-expression result, and use the spooled result whenever the common sub-expressions are used. However, the sub-expression result may be relatively large, in which case some of the benefits of spooling will be lost. If a sub-expression result is larger than the available memory, then spooling the result may cause it to be paged to disk, which may be just as costly of resources as computing the result of the sub-expression twice.
In view of the foregoing, there is a need for a query evaluation system that overcomes the drawbacks of the prior art.
SUMMARY OF THE INVENTION
The present invention provides a system and method for efficient query evaluation. A technique is provided for identifying joins in relational expressions that can be performed on a per-segment basis. In accordance with the invention, joins are identified whose operands are different instances of a common sub-expression, optionally modified by an aggregate or a filter. Each segment of the common subexpression is spooled, and the join is performed successively on each of the segments. Because the segments are likely to be relatively small compared to the entire sub-expression, these segments may fit in memory in situations where the entire sub-expression result does not. Thus, unnecessary spooling of an entire sub-expression result (and the consequent memory swapping) is avoided.
Joins that may be evaluated on a per-segment basis are identified by searching for joins that meet the following criteria: First, the two operands of the join must be different instances of the same relation. Optionally, each instance of the relation may be modified by an aggregate and/or a filter. Second, the join predicate must be, or conjunctively include, an equality comparison between the same column in different instances of the relation. If the join predicate contains such a comparison, then rows of the first instance of the relation will never join with rows of the second instance of the relation that have different values in the equality-compared columns. Thus, the relation can be “segmented” into groups of rows having common values in the columns that are compared for equality in the join predicate, and the join may be separately applied to each of the groups.
The invention provides a relational operator called “GbApply,” which specifies per-segment evaluation of a relational expression. GbApply takes a relation as its input, segments the relation according to a set of columns, and applies a relational fragment to successive segments of the relation. A join meeting the conditions described above may be rewritten using a GbApply operator. The relation that is common to both sides of the join is used as the input to the GbApply operator, and the columns that are compared for equality in the join predicate are specified as the segmenting columns. The join expression is then rewritten so that the operands and the predicate refer to instances of the segment rather than instances of the entire relation; the rewritten join expression is the “relational fragment” used by the GbApply operator. The GbApply operator may be used as part of the expression tree that represents a relational expression. Expressions trees including the GbApply operator may be “reordered” if certain conditions are met; reordering the order of evaluating an expression may result in a more efficient evaluation of the expression.
Execution iterators are provided which may be used to perform a GbApply operation. The “SegSpool” iterator receives a sorted relation as input and spools a segment of the relation. Preferably, SegSpool performs the segmentation by spooling successive rows of the sorted relation until a row is encountered whose values in the segmenting columns differ from the last row. The “SegApply” iterator applies the relational fragment associated with a GbApply operator to the spool created by SegSpool. Application of the relational fragment to the spool is repeated until the relational fragment is unable to produce additional result rows based on the spooled segment. SegApply then calls SegSpool to spool the next segment.
According to a feature of the invention, SegSpool and SegApply may be used to perform a major-minor sort, or to compute the aggregates “min” and “max.” When a major-minor sort is performed, SegSpool is used to segment the table according to the “major” columns; SegApply then applies a sorting operation (on the “minor” columns) to each of the segments. In order to compute the aggregates “min” and “max” (e.g., the minimum value in column A for each group of rows grouped by column B), the table is sorted on columns B and A. SegSpool is then used to segment the table according to column B, and SegApply is used to identify the first row in each segment (for a “max” calculation, the sort on column A is performed in descending order, so that the first row in each group will have the highest column A value for that group).
Other features of the invention are described below.


REFERENCES:
patent: 5701454 (1997-12-01), Bhargava et al.
patent: 5855012 (1998-12-01), Bhargava et al.
patent: 6275918 (2001-08-01), Burky et al.
patent: 6411951 (2002-06-01), Galindo-Legaria et al.
patent: 6618719 (2003-09-01), Andrei
Claussen, J. Kemper, A. Moerkotte, G. Peithner, K. Steinbrunn, “Optimization of evaluation of disjunctive” Knowledge and Data Engineering, IEEE Transactions on Passau Univ.,Germany; Publication Date: Mar. Apr. 2000 ; page(s): 238-260.*
Chatziantoniou Damianos, et al., “Groupwise Processing of Relational Queries,”Proceedings of the 23rdVLDB Conference, 1997, 476-485.
Roy, Prasan, et al., “Efficient and Extensible Algorithms for Multi Query Optimization,”Technical Report, Indian Institute of Technology, Bombay, 1998, 249-260.
Zhang, Weiye, et al., “Speeding Up Heterogeneous Data Access by Coverting and Pushing Down String Comparisons,”15thInternational Conference on Data Engineering, 1999, 261.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

System and method for segmented evaluation of database queries does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with System and method for segmented evaluation of database queries, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and method for segmented evaluation of database queries will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3328501

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.