Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-04-16
2001-08-07
Breene, John (Department: 2777)
C707S793000
active
06272486
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to computer-implemented database systems and, in particular, to determining the optimal number of tasks for building a database index in a (virtual) memory-constrained environment.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database management system (DBMS) that uses relational techniques for storing and retrieving data. Relational databases are organized into tables that consist of rows and columns of data. The rows are formally called tuples. A database will typically have many tables, and each table will typically have multiple tuples and multiple columns. The tables are typically stored on direct access storage devices (DASD), such as magnetic or optical disk drives, for semi-permanent storage.
A table can be divided into partitions, with each partition containing a portion of the table's data. By partitioning tables, the speed and efficiency of data access can be improved. For example, partitions containing more frequently used data can be placed on faster data storage devices, and parallel processing of data can be improved by spreading partitions over different DASD volumes, with each I/O stream on a separate channel path. Partitioning also promotes high data availability, enabling application and utility activities to progress in parallel on different partitions of data.
An index is an ordered set of references to the records or rows in a database file or table. The index is used to access each record in the file using a key (i.e., one of the fields of the record or attributes of the row). However, building an index for a large file can take a considerable amount of elapsed time. The process involves extracting a key value and record identifier (rid) value from each of the records, sorting all of the key/rid values, and then building the index from the sorted key/rid values. Typically, the extract, sort, and index build processes are performed serially, which can be time consuming in the case of a large database file. Even when some of these tasks are performed in parallel, memory constraints can make the processes inefficient.
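As an illustration only, the serial extract, sort, and build sequence described above can be sketched in Python. The record layout, the use of a record's list position as its rid, and the names `Index`, `extract`, `build_index`, and `lookup` are assumptions made for this sketch; none of them come from the specification.

```python
import bisect
from typing import Any, NamedTuple


class Index(NamedTuple):
    """Hypothetical index: parallel lists of sorted keys and their rids."""
    keys: list
    rids: list


def extract(records: list[dict[str, Any]], key_field: str) -> list[tuple[Any, int]]:
    # Extract step: one (key, rid) pair per record.
    # Assumption: the rid is simply the record's position in the list.
    return [(rec[key_field], rid) for rid, rec in enumerate(records)]


def build_index(records: list[dict[str, Any]], key_field: str) -> Index:
    pairs = extract(records, key_field)
    # Sort step: order the key/rid pairs by key, then by rid.
    pairs.sort()
    # Build step: store the sorted pairs as two parallel lists.
    return Index([key for key, _ in pairs], [rid for _, rid in pairs])


def lookup(index: Index, key: Any) -> list[int]:
    # Binary search for the run of entries whose key equals `key`.
    lo = bisect.bisect_left(index.keys, key)
    hi = bisect.bisect_right(index.keys, key)
    return index.rids[lo:hi]


records = [{"name": "carol"}, {"name": "alice"}, {"name": "bob"}, {"name": "alice"}]
idx = build_index(records, "name")
matches = lookup(idx, "alice")
```

Each step here consumes the full output of the previous one, which is the serial behavior the passage describes. With many records, the sort step dominates both elapsed time and memory use.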
When data is loaded or reorganized, indexes are built to provide access to the data, and building these indexes can be very time consuming. Likewise, when a computer system fails, indexes can be corrupted or destroyed, and recovery, which involves rebuilding each index, can also be very time consuming. Therefore, there is a need in the art for techniques that build indexes more efficiently.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for a computer-implemented system for building indexes. In accordance with the present invention, a database is stored in a data storage device coupled to a computer. An amount of available memory is determined. An amount of memory for use in transmitting data between extract, sort, and index build tasks is determined. Then, a number of sort tasks to be used to build indexes is determined based on the determined amount of available memory, the determined amount of memory for use in transmitting data between tasks, and task memory requirements.
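The summary names three inputs to the sort-task calculation but does not state how they are combined. The following Python sketch shows one plausible combination, offered purely as an assumption: the memory set aside for transmitting data between tasks is subtracted from the available memory, and the remainder is divided by the memory one sort task requires. The function name `sort_task_count`, its parameters, and the formula itself are hypothetical and are not taken from the specification.

```python
def sort_task_count(available_memory: int,
                    transfer_memory: int,
                    sort_task_memory: int) -> int:
    """Hypothetical estimate of how many sort tasks fit in memory.

    Assumed formula (not from the specification):
        floor((available_memory - transfer_memory) / sort_task_memory)
    All three arguments use the same unit, for example kilobytes.
    """
    if sort_task_memory <= 0:
        raise ValueError("sort_task_memory must be positive")
    # Memory left for sort tasks after reserving the inter-task buffers.
    usable = available_memory - transfer_memory
    # Never report a negative task count when the buffers alone exceed memory.
    return max(usable // sort_task_memory, 0)


tasks = sort_task_count(available_memory=1000,
                        transfer_memory=200,
                        sort_task_memory=250)
```

Under these assumed figures, 800 units remain after the transfer buffers are reserved, so three sort tasks of 250 units each fit. A result of zero would signal that the environment is too memory constrained for even a single sort task under this model.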
An object of the invention is to provide a more efficient index building system. Another object of the invention is to determine the number of sort tasks that can be invoked to build indexes. Yet another object of the invention is to determine the number of extract tasks that can be invoked to build indexes.
Garth John Marland
Ruddy James Alan
Breene John
International Business Machines Corporation
Pretty, Schroeder & Poplawski, P.C.
Robinson Greta