Run formation in large scale sorting using batched...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000

Reexamination Certificate

active

06571244

ABSTRACT:

TECHNICAL FIELD
This invention relates to systems and methods for large scale sorting of data records. More particularly, this invention relates to run formation techniques for producing sorted subsets or “runs” of data records that will be subsequently merged together to produce the final sorted set of data records.
BACKGROUND
Database systems store data records on disk arrays or other forms of non-volatile memory. Operations performed on databases can be easily accommodated if all of the data records are first loaded into volatile main memory (i.e., RAM) for processing. However, databases often contain large numbers of data records, which far surpass the volatile memory resources. As a result, subsets of data records are processed in the main memory and then written back to disk after they are processed.
Large scale sorting is one type of operation performed by a database system on a large set of data records that is too large to fit in main memory. “Mergesort” is the standard technique used for sorting large sets of records. Mergesort has two phases: (1) a run formation phase that creates sorted subsets, called “runs”, and (2) a merge phase that repeatedly merges runs into larger and larger runs, until a single run has been created. This invention pertains to the run formation phase of a mergesort operation.
In general, there are two types of run formation techniques: load-sort-store and replacement selection. Most sort implementations use the load-sort-store algorithm for run formation. This invention is directed to an improved replacement selection technique. Load-sort-store is briefly described below, followed by a more detailed look at replacement selection.
Run Formation Type I: Load-Sort-Store
FIG. 1
shows the load-sort-store process implemented on a database system
20
having main memory
22
and persistent data storage
24
. As the name implies, the load-sort-store process has three main steps: loading, sorting, and storing. During a “load” step
1
, a subset of records is read from the data storage
24
into the main memory
22
. Next, during a “sort” step
2
, the process extracts pointers to all records into an array and sorts the entries in the array according to a designated sort key to form the run. Any in-memory sorting algorithm can be used, with “Quicksort” being the most popular choice. Afterwards, during a “store” step
3
, the run is written back to the data storage
24
and all records in main memory
22
are erased.
These three steps are repeated until all records have been processed. All runs created will be of the same length, except possibly the last run. After multiple runs have been created and stored during the run formation phase, the sorting process transitions to the merge phase in which the runs are merged into one large run.
One drawback with the load-sort-store process is that CPU processing and I/O (input/output) cannot be overlapped. As a result, the computing resources of the database system are not fully utilized.
Run Formation Type II: Replacement Selection
Replacement selection is an alternative technique that produces longer runs and completely overlaps CPU processing and I/O. It is based on the observation that by tracking the highest key output thus far, the process can easily determine whether an incoming record can still be made part of the current run or should be deferred to the next run. Any record tacked on to the current run in this way increases the run length. Adding a record to the current run is expensive if a (sorted) pointer array is used, so replacement selection often uses a (binary) selection tree instead.
FIG. 2
shows a replacement selection process implemented in the database system
20
. The process described below is the traditional version of the replacement selection algorithm as described, for example, by Donald Knuth in “
The Art of Computer Programming, Volume
. 3
: Sorting and Searching
”, Second Edition 1998. This version of the algorithm assumes that records are of fixed length but, in practice, variable-length records are more common.
Initially, a subset of records is read into the main memory
22
. When records are of fixed length, the main loop of the algorithm has three steps: selection, output, and get new record. In the “selection” step
1
, the process selects from among the records in memory, a record with the lowest key greater than or equal to the last key output. Then, at the “output” step
2
, the selected record is written out to the data storage
24
and the key value is remembered. Afterwards, at the “get” step
3
, the process retrieves a new record from the data storage
24
and stores it in the slot previously occupied by the record just output.
While load-sort-store is more widely used, replacement selection has many benefits. First, replacement selection produces runs larger than the memory used, which reduces the number of runs and the subsequent merge effort. Second, replacement selection exhibits steady I/O behavior during run generation rather than I/O in great bursts. This improves the utilization of I/O devices. Third, it very effectively exploits pre-sorting to produce longer runs, i.e., input sequences that are not random but somewhat correlated to the desired sorted output sequence. In particular, if the input is already sorted, a single run will be generated. Fourth, sorting is often done as part of grouping with aggregation or for duplicate removal. A technique called early aggregation can then be applied, which reduces I/O significantly. Early aggregation achieves much higher I/O reduction if runs are created by replacement selection.
The following sections examine the selection step of replacement selection using a binary selection tree. The first section considers a tree formed with fixed length records, and a subsequent section examines the tree with variable length records.
Selection Tree for Fixed Length Records
The selection step is the most expensive part of the replacement selection operation. The selection step can be performed efficiently with the help of a selection tree, using the run number and the record key as the (compound) selection key. A selection tree for N records is a left-complete binary tree with N external nodes and N−1 internal nodes stored in an array without pointers. Logically, each external node or tree “leaf” stores a data record with a combined sort key consisting of a run number and the original sort key of the record. The combined sort key is here called a run-key pair and represented by a notation “run number, key” such as “1,12”. Each internal node stores the lesser of the sort keys of its two children nodes plus a reference to the source node of the key. Physically, the selection tree and nodes can be implemented in several different ways.
FIGS. 3
a
and
3
b
show a selection tree
30
before and after a, selection replacement operation for replacing the lowest key on the first run (i.e., key
12
for run
1
).
FIG. 3
a
shows the initial state of the selection tree
30
with five keys (i.e.,
5
,
12
,
15
,
22
, and
32
). The key of the last record output was 8. External nodes or tree “leafs” are drawn as rectangles and internal tree nodes are shown as ellipses. The number to the left of a node indicates its position in the array.
Initially, the record represented by external node (
1
,
12
) is selected and output. Its key value “
12
” is now the last key value output, and this value is recorded. Outputting the record frees up the external node at position
9
. A new record is input and stored in this node. The key of the new record is
30
, which is higher than the last key value output (
12
) so the new record “fits” in the current run. Its run number is set to 1, resulting in the pair (
1
,
30
) shown in position
9
. Next, a new minimum key is found by traversing the path from the external node at position
9
up to the root at position
1
. At each node, the sort keys of two sibling nodes are compared and the node with the lower sort key is promoted to the parent node. For example, for external nodes (
1
,
32
) and

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Run formation in large scale sorting using batched... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Run formation in large scale sorting using batched..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Run formation in large scale sorting using batched... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3054409

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.