Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-09-14
2001-09-25
Corrielus, Jean M. (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06295539
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to database management systems, and particularly to a method for dynamically determining an optimal process for enforcing constraints.
BACKGROUND INFORMATION
A well known database software program is DATABASE 2 (DB2) database software distributed by IBM Corporation. As is known in the art, DB2 operates as a subsystem in a computer system operating under the IBM MVS operating system software. In a DB2 environment, user data resides in DB2 tables which are in tablespaces. A tablespace is, for example, a portion of storage space in a direct access storage device (DASD) such as a disk drive. As is known in the art, address space and dataspace refer to virtual storage in an IBM MVS operating system, address space being capable of executing instructions while dataspace is a subset of address space and is limited to storing data.
For exemplary purposes, illustrated below is an order_entry table that would be stored in a tablespace. The order_entry table contains columns: customer_number; product_code; order_number; buyer_name; and ship_to_zip.
customer_number
product_code
order_number
buyer_name
ship_to_zip
1111111111
0010
1234500001
John Doe
60606
1111111111
0040
1234500002
Jane Doe
70707
3333333333
0020
1234500003
Bill Smith
90909
2222222222
0030
1234500004
Fred Smith
80808
While the above order_entry table shows four rows, the table could have millions of rows for all the orders of a company, for example 4 million rows. The order_entry table also has, for example, three index keys and two foreign keys. An index key is an identifier for a particular row of a table while a foreign key also identifies a row but in addition is used for referential integrity as described below. For example, in the order_entry table, one index key could be based on order_number, another index key based on buyer_name and a third index key based on ship_to_zip.
As is known in the art, an index key for a particular table indicates a row identification (RID) and a selected value for the row (e.g., the index key value). The index key can be used to generate an index for the table which facilitates subsequent searches for particular data in the table. For example, the order_entry table would have three indexes (e.g., one for each index key), each index being stored in an indexspace. Similar to a tablespace, an indexspace is, for example, a designated portion of a DASD. Thus, if a user was looking for rows that contain a particular buyer name in the order_entry table, the database management system could query the buyer index for the table to identify all occurrences of the buyer name without reading the entire table to locate the rows.
As is known in the art, each table in a database may be either a parent table, a child table or both. A child table is related to a parent table via the foreign key value or values contained in columns of the child table. For example, a foreign key value can appear multiple times in a child table (e.g., multiple rows in a child table can have the same foreign key, such as the customer_number and product_code entries in the order-entry table) but each foreign key must be associated with a unique key in a parent table of the child table. Referential integrity ensures that every foreign key value is valid (e.g., has a corresponding primary key in a parent table). Thus, referential integrity (RI) means that a value in the column of a row in the table is valid when this value also exists in an index of another table. A row should not be in a table if it violates a constraint. As the order_entry table illustrated above has two foreign keys, it has for example, a RI constraint on customer_number and product_code. As is known in the art, when a user of a DB2 database management system creates a table, the user also defines the constraints for the table (e.g., the user can define the relational integrity criteria).
Illustrated below is a product table and a customer table (e.g., the parent tables for the foreign keys in the order_entry table).
Product Table
product_code
product_description
retail_price
00010
laptop pc
1000.00
00020
desktop pc
1100.00
00030
office pc
1200.00
00040
lan pc
3500.00
00050
home pc
999.99
The product table shows five rows, although the table could have thousands of rows for all of the different products of a company. The product table has a unique index on the column product_code, which is illustrated in ascending order. The values in the column product_code are each unique since there is only one product code assigned to each product and thus in this table, a product code would not be included more than once. Accordingly, an index for the product table would include the key (e.g., the stored value in the product_code column) and a RID. The product table index would reside in a DB2 indexspace.
The customer table illustrated below shows four rows, although this table could also have thousands of rows for all of the customers of a company. The customer table has a unique index on the column customer_number, illustrated in ascending order. The values in the column customer_number are each unique since there is only one customer number assigned to each customer name and thus a customer number would not be included in this table more than once. Accordingly, an index for the customer table would include the key (e.g., the value of the column customer_number) and a RID. The customer index would also reside in a DB2 indexspace.
Customer Table
customer_number
buyer_name
customer_address
1111111111
John Doe
State A
2222222222
Fred Smith
State B
3333333333
Bill Smith
State C
4444444444
Steve Jones
State D
As shown by the above tables, all of the rows in the order_entry table are valid because the foreign key values in the column product_code exist in the index of the product table and the values in the column customer_number exist in the index of the customer table.
Conventional database management systems, such as DB2, provide the user with the ability to identify specific conditions that a row must meet before it can be added to a table. These conditions are referred to as “constraints” because they constrain the values that a row may include. Constraints include, for example, check constraints and referential integrity constraints. Check constraints include, for example, qualifying criteria for a particular value, such as a zip code value being in the range of 00000 to 99999.
Constraint checking is required at various times. For example, when a copy is made of a loaded database table or when a database table is recovered after a failure, DB2 will not allow access to the table until constraint enforcement is performed. Constraint checking is generally performed by a check utility, e.g., an utility designed to perform constraint checking. Examples of conventional check utilities are CHECK DATA by International Business Machines of Armonk, N.Y. and CHECKPLUS by BMC Software Co. of Houston, Tex.
FIG. 1A
illustrates a prior art method for checking check constraints and referential integrity (RI) constraints. As described below, constraint checking for referential integrity performed by conventional check utilities involves execution of a sort task. A sort task places foreign key values in a collated arrangement to facilitate comparison of the foreign key values with the values in the corresponding parent index.
The operation of a conventional check utility involves reading the tablespace to extract foreign keys, passing the foreign keys to a sort operation for collating and then passing the collated foreign keys back to the check utility which reads the appropriate parent index and checks the foreign key values for validation. Reading a tablespace and a parent index in a MVS mainframe environment, required when a sort task is used for constraint checking, involves I/O operations. Each I/O operation could take as long as 35 ms. In contrast, a cpu in a MVS mainframe environment could execute 35 million to 65 million instructions per second. Therefore, utilization of I/O operations in constraint checking is a significant
Baker & McKenzie
Computer Associates Think Inc.
Corrielus Jean M.
LandOfFree
Dynamic determination of optimal process for enforcing... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Dynamic determination of optimal process for enforcing..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Dynamic determination of optimal process for enforcing... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2529682