Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-07-08
2002-12-31
Vu, Kim (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06502088
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to computer systems, and more particularly to the access of databases in the computer system.
BACKGROUND OF THE INVENTION
Databases are an important tool for the storage and management of information for businesses. Both relational and non-relational databases exist for this purpose. Examples of relational databases include ORACLE, DB2, AND INFORMIX. Examples of non-relational databases include custom databases created with the 4690 operating system, developed by INTERNATIONAL BUSINESS MACHINES CORPORATION. The 4690 operating system allows programmers to created custom databases which support keyed, sequential, and binary file types. Database Management Systems (DBMS) provide users the capabilities of controlling read/write access, specifying report generation, and analyzing usage.
Some businesses, particularly large organizations, maintain multiple database types, both relational and non-relational. The simultaneous access to these databases then becomes an issue. For example, a company may wish to combine and compare customer preference information in the marketing department's ORACLE database with the current customer issues tracked in the customer service department's DB2 database. One conventional way of accessing these databases is through the DB2 DATAJOINER (“DataJoiner”) products, developed by INTERNATIONAL BUSINESS MACHINES CORPORATION. DataJoiner is a multi-database server which uses the Structured Query Language (SQL) and Open Database Connectivity (ODBC), to provide client access to diverse data sources that reside on different platforms.
FIG. 1
illustrates a computer system which uses DataJoiner. The computer system
100
comprises clients
102
in various platforms, a DataJoiner server
104
, and data sources
106
of various types. The Data Joiner server
104
allows the clients
102
to transparently access data from the different data sources
106
.
FIG. 2
is a block diagram illustrating the basic architecture of DataJoiner. DataJoiner
202
is a DBMS which includes ODBC software and drivers for a variety of data sources. DataJoiner
202
allows for transparent access to multiple data sources in a heterogeneous environment. The Generic Access Applications Programming Interface (API)
204
allows for customized access to home-grown or other data sources
208
not already supported by DataJoiner. Generic Data Access Modules
206
containing Custom ODBC compliant drivers for these data sources may be written and installed on the same system as DataJoiner
202
and used by DataJoiner
202
to access the data sources
208
. The Generic Access API
204
defines the calls which must be provided by the custom ODBC drivers in order for DataJoiner
202
to access the custom data source
208
.
However, DataJoiner is not able to provide certain features for access to the non-relational databases. In particular, DataJoiner
202
does not provide (1) a two-phase commit, (2) performance of SQL operations from within a single application across heterogeneous file systems, (3) referential integrity, and (4) indices for queries across heterogeneous file system which include non-relational data sources, such as
4690
keyed, sequential, and binary file systems.
Two-Phase Commit
DataJoiner can perform data changes across multiple databases sources within a single operation, while guaranteeing that changes to one source can be rolled back if changes to another source within the same transaction fail. The two-phase commit is the mechanism for implementing this. For example, someone may be trying to wire some money electronically to another location. A debit would need to be written against one account maintained by a first DBMS, and a credit would have to be reflected in another account maintained by a second DBMS. The two-phase commit protocol ensures that either all of the DBMS involved commit the changes required by the transaction or none do. However, keyed, sequential, and binary file systems cannot participate in this function.
SQL Operations from within a Single Application
Conventional relational DBMS's employ two underlying data storage concepts in order to facilitate relational operations on that data: (1) data is stored in tables with each table having a fixed number of columns; and (2) system catalogs, or data dictionaries, are generated for each table. These data dictionaries (metadata) describe the column layout, index structure, and other data access information. These concepts allow one application to create data, and other applications totally unrelated to the first, to analyze and update the data.
However, with keyed, sequential, and binary file systems, there are several inhibitors that prevent these file systems from having the above flexibility. These include: (1) data is not held in a fixed table format; (2) data access is via a buffer that passes the entire record, with no public or obvious field delineation available; and (3) the structure, or field layout, of keyed, sequential, and binary records is not stored in a publicly available catalog. With these inhibitors, relational operations cannot be performed on the keyed, sequential, and binary file systems, making SQL operations from within a single application across heterogeneous file systems which include these file systems difficult or impossible.
Referential Integrity
Referential integrity is a technique where the consistency of data within a DBMS is ensured by referring to one piece of data during an operation on another piece of data. An example would be to refer to a department table to ensure that a valid department number is being used when a new employee is being added to the database. Referential integrity is incorporated in many of the conventional relational DBMS and is being enforced at the database level to ensure that all applications adhere to any constraints and thus ensure the integrity of the data.
However, the concept of referential integrity, and a file system mechanism to enforce it, does not exist for keyed, sequential, and binary file systems. Without this function, heterogeneous DBMS can not implement referential integrity between the relational and non-relational data. Using the above example, if the valid department list is in a keyed file system and the new employee data was to be stored in a relational database, there would be no way that the underlying data stores could enforce the referential integrity during the employee add operation.
Indices for Queries
A need often arises to access keyed and sequential data based on data selection criteria different from the key or record number traditionally used to access the data. Queries based on this type of selection criteria are referred to as ad-hoc queries. For instance, instead of retrieving an employee record by an employee number there may be a need to retrieve all of the employees who are employed in department “X”. The employee records are keyed by an employee identification number yet the selection criteria is based on the department number in which the employee is employed.
For relational databases, ad-hoc queries may be efficiently satisfied since indices of the data may be created for data elements other than the key or record number. Once created; these indices may be used as an alternative form of data access which provides a more efficient processing of the query.
However, to satisfy queries using conventional file access mechanisms available for hash-keyed or sequential files, every record in the file must be read and compared to the selection criteria since indices are not available for these files. This causes performance problems both in accessing the file and in creating the network traffic when the querying application resides on a machine other than the source data files.
Accordingly, there exists a need for a method and system for improved access to non-relational databases in a computer system. The method and system should increase the flexibility and efficiency of access to the databases. The present invention addresses such a need.
SUMMA
Gajda Kimberly Lynn
Hess Bradley Allan
Kemp Julie Anne Hill
Lewis Jerry Jay
Thorpe Robert Tod
International Business Machines - Corporation
Ly Anh
Sawyer Joseph A.
Schelkopf J. Bruce
Vu Kim
LandOfFree
Method and system for improved access to non-relational... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and system for improved access to non-relational..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and system for improved access to non-relational... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2987434