Patent
1989-10-05
1992-06-09
Shaw, Gareth D.
395650, G06F 1540
Patent
active
051214946
ABSTRACT:
A technique for performing joins in parallel on a multiple processor database system effectively deals with data skew. The join operation is performed in three stages with an optional fourth stage. The first stage is a preparatory stage, the detail of which depends on the underlying join algorithm used. This preparatory stage provides pre-processing the results of which are used in the following stage as the basis for defining subtasks for the final join operation. The data provided in the first stage is used in the second stage to both define subtasks and to optimally allocate these subtasks to different processors in such a manner that the processors are close to equally loaded in the final join operation, even in the presence of data skew. This second stage is an assignment stage the details of which depend on the underlying join algorithm. Once the second stage has completed its processing of the subtasks, the subtasks are shipped to their assigned processors for processing and the final join of the two relations in the third stage. The method used in the final join operation depends on the underlying join algorithm used. Optionally, during the actual join as performed in the third stage, there could be a dynamic re-assignment of the subtasks should the join operation become unbalanced.
REFERENCES:
C. Date, Addison-Wesley, An Introduction to Database Systems, vol. 1, 3rd Ed., 1982, pp. 209-210.
M. Blasgen & K. Eswaran, Storage and Access in Relational Databases, IBM Systems Journal, vol. 4, 1977, p. 363 et seq.
D. Bitton et al., Parallel Algorithms for the Execution of Relational Database Operations, ACM Trans. on Database Systems, vol. 8, No. 3, Sep. 1983, pp. 324-353.
P. Valduriez et al., Join and Semijoin Algorithms for a Multiprocessor Database Machine, ACM Trans. on Database Machines, V. 9, No. 1, Mar. 1984, pp. 133-161.
J. P. Richardson et al., Design and Evaluation of Parallel Pipelined Join Algorithms, ACM SIGMOD 1987, May 1987, pp. 160-169.
S. G. Akl et al., Optimal Parallel Merging and Sorting without Memory Conflicts, IEEE Trans. on Comp., vol. c-36, No. 11, Nov. 1987, pp. 1367-1369.
D. J. Dewitt et al., Multiprocessor Hash-based Join Algorithm, Proc. 11th VLDB, 1985.
M. S. Lakshmi et al., Effect of Skew on Join Performance in Parallel Architectures, Proc. Intl. Symposium on Databases in Parallel and Distributed Database Systems, 1988.
D. A. Schneider et al., A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment, Proc. ACM SIGMOD Conference, 1989.
R. C. Hu et al., Removing Skew Effect in Join Operation on Parallel Processors, Technical Report CSD-890027, UCLA, 1989.
Dias Daniel M.
Wolf Joel L.
Yu Philip S.
IBM Corporation
Kulik Paul
Shaw Gareth D.
LandOfFree
Joining two database relations on a common field in a parallel r does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Joining two database relations on a common field in a parallel r, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Joining two database relations on a common field in a parallel r will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-1812044