Non-blocking parallel band join algorithm

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000, C707S793000, C707S793000, C707S793000

Reexamination Certificate

active

06804678

ABSTRACT:

BACKGROUND
Relational databases are used for storage and retrieval of information. The information is structured in the database as two-dimensional tables of rows and columns. A column heading designates the type of data stored in each column. The information is stored in a non-volatile medium such as a disk array.
Users may access the database information typically by using database management software. The database storage media and management software together comprise a database management system, or DBMS. DBMSs may be implemented on a centralized mainframe system, or may be distributed in a client-server network, as examples.
The database management software includes specialized commands for accessing the database information. For example, a common command for accessing data is a Structured Query Language (SQL) “select” query. Using the select query, one or more rows from one or more tables of the database may be retrieved.
Traditionally, DBMSs processed queries in batch mode. In other words, a user wanting to extract information from the database would submit a query, wait a long time during which no feedback is provided, and then receive a precise answer.
Today, on-line aggregation and adaptive query processing present alternatives to traditional batch query processing. On-line aggregation permits progressively refined running aggregates of a query to be continuously displayed to the requesting user. The running aggregates, or intermediate results, are displayed typically along with a “confidence” factor. Adaptive query processing involves an iterative feedback process in which the DBMS receives information from its environment and uses the information to adapt the behavior of the query.
One area of optimization involves join operations. When queries involving multiple tables are made, a join operation may be performed. Upon receiving the multi-table query, tuples, or rows, from one table are joined with tuples from a second table, to produce a result. An equijoin is a type of join operation in which an entry, or column, of a tuple from one table has the same value as an entry of a tuple from a second table.
A band join is a non-equijoin of tuples of two tables in which the join condition is a range or band rather than an equality. Band joins may be useful in queries that involve real world domains, such as time, position, or price.
For example, suppose that a user of the DBMS wants to investigate the correlation between the situation of the stock market and important company events. Two tables, PRICE and NEWS, are involved. Tuples of PRICE represent the oscillation of stocks within a day, with attribute PRICE.C representing the time of the measurement in seconds. Tuples of NEWS represent financial news articles events, with attribute NEWS.D representing the time in seconds that the article was released.
Suppose the user wants to find all pairs of events occurring at nearly the same time, such that the first event represents a great oscillation of a stock within a day, and the second event represents a news event that mentions the company. Such a query may use a band join. The query may be written in SQL as:
SELECT PRICE.SYMBOL, NEWS.ARTICLE, PRICE.PERCENT_CHANGE
FROM PRICE, NEWS
WHERE PRICE.PERCENT_CHANGE>10
AND PRICE.C-NEWS.D<=300
AND PRICE.C-NEWS.D>=−300
AND NEWS.ARTICLE.CONTAINS (PRICE.SYMBOL)
One of the conditions uses a join operation, in which the difference between attribute, PRICE.C, and attribute, NEWS.D, is itself between −300 and 300. Tuples that meet the join criteria become part of the result table for the query.
There are two kinds of widely used traditional band join algorithms: the partitioned band join algorithm, which employs both a partitioning phase and a sorting phase, and the sort-merge band join algorithm, which employs a sorting phase and several merging phases. Both of these band join algorithms generate no results before the final phase. Thus, these types of band join algorithms are “blocking,” and, thus, are inappropriate for on-line aggregation and adaptive query processing. If, instead, users of the DBMS receive an approximation of the final results during processing, the query may, in some cases, be aborted, long before its completion.
SUMMARY
In accordance with the embodiments described herein, a method and apparatus are disclosed in which first tuples are stored in a first table in a database system, second tuples are stored in a second table in the database system, the first and second tuples are partitioned into plural portions, and the first and second tuples are joined based upon the partitioning portions.
In other embodiments, the selection of any first tuple to be joined with any second tuple is random. In still other embodiments, result tuples are available after performing only a single join operation.
Other features and embodiments will become apparent from the following description, from the drawings, and from the claims.


REFERENCES:
patent: 4930072 (1990-05-01), Agrawal et al.
patent: 5551031 (1996-08-01), Cheng et al.
patent: 5557791 (1996-09-01), Cheng et al.
patent: 5745896 (1998-04-01), Vijaykumar
patent: 5832475 (1998-11-01), Agrawal et al.
patent: 6032144 (2000-02-01), Srivastava et al.
patent: 6061676 (2000-05-01), Srivastava et al.
patent: 6081801 (2000-06-01), Cochrane et al.
patent: 6112198 (2000-08-01), Lohman et al.
patent: 6205441 (2001-03-01), Al-omari et al.
patent: 6226639 (2001-05-01), Lindsay et al.
patent: 6415297 (2002-07-01), Leymann et al.
patent: 6484159 (2002-11-01), Mumick et al.
patent: 6493701 (2002-12-01), Ponnekanti
patent: 6618719 (2003-09-01), Andrei
patent: 6625593 (2003-09-01), Leung et al.
patent: 2002/0103793 (2002-08-01), Koller et al.
R. Avnur et al., “Eddies: Continuously Adaptive Query Processing”, SIGMOD Conf. 2000, pp. 261-272.
D. Bitton et al., “Benchmarking Database Systems a Systematic Approach”, VLDB 1983, pp. 8-19.
J. Chen et al., “NiagaraCQ: A Scalable Continuous Query System for Internet Databases”, SIGMOD Conf. 2000, pp. 379-390.
D. DeWitt et al., “A Performance Analysis of the Gamma Database Machine”, SIGMOD Conf. 1988, pp. 350-360.
D. DeWitt et al., “Client-Server Paradise”, VLDB 1994, pp. 558-569.
D. DeWitt et al., “An Evaluation of Non-Equijoin Algorithms”, VLDB 1991, pp. 443-452.
G. Graefe, “Query Evaluation Techniques for Large Databases”, ACM Comput. Surveys, 25(2):73-170, Jun. 1993.
P. Haas et al., “Join Algorithms for Online Aggregation”, IBM Research Report RJ10126, 1998.
P. Haas, “Techniques for Online Exploration of Large Object-Relational Datasets”, Proc. 11th Intl. Conf. Scientific and Statistical Database Management, 1999, pp. 4-12.
J. Hellerstein et al., “Interactive Data Analysis: The Control Project”, IEEE Computer, 32, Aug. 1999, pp. 51-59.
J. Hellerstein et al., “Informix under CONTROL: Online Query Processing”, Data Mining and Knowledge Discovery, 2000.
J. Hellerstein, “Online Processing Redux”, IEEE Data Engineering Bulletin, Sep. 1997.
J. Hellerstein, “Looking Foward to Interactive Queries”, Database Programming and Design, Aug. 1998.
J. Hellerstein et al., “Adaptive Query Processing: Technology in Evolution”, IEEE Data Engineering Bulletin, Jun. 2000.
P. Haas et al., “Ripple Joins for Online Aggregation”, SIGMOD Conf. 1999, pp. 287-298.
J. Hellerstein et al., “Online Aggregation”, SIGMOD Conf. 1997, pp. 171-182.
Helmer et al., “Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Join Predicates”, VLDB 1997, pp. 386-395.
Z. Ives et al., “An Adaptive Query Execution System for Data Integration”, SIGMOD Conf. 1999, pp. 299-310.
J. Naughton et al., “The Niagara Internet Query System”, submitted for publication, 2000.
F. Olken, “Random Sampling from Databases”, Ph.D. Dissertation, UC Berkeley, Apr. 1993, available as Tech. Report LBL-32883, Lawrence Berkeley Laboratories.
H. Pang et al., “Partially Preemptible Hash Joins”, SIGMOD Conf. 1993, pp. 59-68.
H. Pang et al., “Memory-Adaptive External Sorting ”, VLDB 1993, pp. 618-629.
V. Raman et al., “Online Dynamic Reordering for Interactive Data Processing”

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Non-blocking parallel band join algorithm does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Non-blocking parallel band join algorithm, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Non-blocking parallel band join algorithm will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3315571

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.