Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2000-08-21
2004-06-22
Breene, John (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C705S029000
Reexamination Certificate
active
06754666
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field Of The Invention
The present invention relates to storage and access of data in a database, and, more particularly, to a method and apparatus and data structures used therein such that wherein data elements with dissimilar structures may be stored in one table, with minimal wasted space, to allow for efficient data access, each data element being stored in a record of the table and having a category designation to associate the record with a set of attributes that are stored in another location, a category hierarchy is provided and allows for attribute inheritance of a parent's attributes by a child category.
2. Description Of The Related Art
A DBMS (Database Management System) is used to manage data and is comprised of computer-executable code that may be used to define the structure of data and to access data within the defined structure. One example of a DBMS is a relational DBMS, or RDBMS. An RDBMS manages tables that make up a relational database as well as the data contained in the tables. In an RDBMS, data is organized in rows (or records) and columns (or fields) of the tables, and two or more tables may be related based on like data values. The intersection of a row and column in a table is referred to as a cell and contains the data value for a particular field of a particular record.
A DML (data manipulation language) such as SQL (Structured Query Language) is typically used to store, retrieve and modify data in a table. A schema defines the structure of a database, i.e., each table and the fields within a record of a table. A schema is itself considered data that is stored in one or more tables. Therefore, like other data in a database, a DML may be used to store, retrieve and modify the data in the database as well as the structure of a database.
In addition to an RDBMS, other examples of data management approaches include file management systems, flat files and hierarchical database management systems. There are, however, shortcomings with the existing data management approaches.
One such disadvantage has to do with the use of hierarchically-related data in an RDBMS. A hierarchy of data necessarily involves parent/child relationships. For example, a data item can be a parent and/or a child of another data item. In a conventional RDBMS, a row in one table may be related to a row in another table. However, the relationship represents a peer (or same-level) relationship rather than a hierarchical (or parent/child) relationship. Consequently, complicated and/or nonstandard mechanisms are used in the conventional DBMS to represent a parent/child hierarchical relationship using the simple row/column tabular structure of a relational database.
FIGS. 1A through 1C
provide an example of a hierarchy in the form of a tree
140
with nodes
100
to
105
. Node
100
is a root node, or node with no parent node in the hierarchy. Node
100
and nodes
101
to
103
are internal nodes (i.e., nodes with at least one child). Nodes
104
and
105
are leaf nodes (i.e., a node that has no children).
In an RDBMS, a “single table” approach may be used to represent the hierarchy depicted in
FIG. 1A
wherein each node is a record of a table and each record contains one or more fields to store the hierarchical relationships with other nodes. A disadvantage of this approach is that it is very difficult to dynamically alter the structure of the database once it is populated with data. Thus, it is necessary to pre-allocate a specific number of fields that may be needed to store the hierarchical relationships which may be inaccurate. If there are not enough fields allocated for storing hierarchical relationships, the structure of the database (e.g., the structure of the table containing the fields that store the hierarchical relationships) must be altered before storing additional relationships in the database. Conventional approaches to re-structuring a populated database requires that existing data be moved to new tables and introduces the possibility of lost data. Thus, it has been thought to be better to err on the side of too many fields. However, this approach results in wasted space as not all records will use the maximum number of fields allocated to store the hierarchical relationships.
Referring to
FIG. 1B
, database
141
includes a record (in one or more tables of the database) for each of nodes
100
through
105
. Specifically, record
110
corresponds to node
100
and contains a field
130
that is used to identify record
110
(i.e., a record identification, or record ID) in the database. Field
130
is also included in records
111
through
113
and contains the record ID of record
110
indicating that records
111
through
113
are child records of record
110
. Records
111
through
113
each include a record ID in fields
121
through
123
, respectively. Since record
114
is a child of record
111
, it includes field
121
that contains the record ID of record
111
.
Node
105
of tree
140
has two parents (i.e., nodes
102
and
103
). To reflect this structure, record
115
of database
141
must include two fields (i.e., fields
122
and
123
) that each refer to one of the parent records of record
115
. Since there is no easy mechanism for dynamically altering the structure, or schema, of the table, an attempt must be made to accurately predict the number of parents a record may have when creating the database schema. As discussed above, there are tradeoffs and drawbacks with both a low and high prediction. Further, it is virtually certain that there will be wasted space no matter what the prediction.
Another approach used in an RDBMS borrows from the file management approach that is used in most operating systems and is hierarchical in nature. In a file management system, the nodes consist of a directory and a file. Each directory is an internal node of the hierarchy, while each file is a leaf node of the hierarchy. Where a file (or directory) has multiple parents, multiple copies are needed such that a copy of the file (or directory) is stored in each of the parent directories.
Using this approach in an RDBMS, each node of the hierarchy is represented as a table in the RDBMS. However, the number of tables increases with the number of internal nodes, which tends to be proportional to the size of the hierarchy. It is also extremely difficult and complex to navigate between the nodes of the hierarchy since each node is located in a separate table.
Using this approach, for example, records
110
through
115
in
FIG. 1B
become single-record tables with two tables created for record
115
.
FIG. 1C
illustrates the new structure used for record is
115
. Each of single-record tables
155
and
156
contain a copy of the data contained in record
115
with the exception of fields
122
and
123
. Table
155
includes field
122
that identifies Node
102
as its parent, and table
156
includes field
123
to identify node
103
as its parent. Tables
155
and
156
are duplicates with the only difference between the two tables being the reference to the parent table.
Thus, a disadvantage of the file management approach is that when, as in the case of node
105
, a node with multiple parents, multiple copies of the node need to be maintained as separate tables. This is wasteful of space and difficult to manage, and data integrity can easily be violated if an update is not reflected in all of the copies of the node it affects.
Hierarchical databases are an efficient approach to storing records that relate to one another in a hierarchical manner. However, records are not stored in tables and thus hierarchical databases do not enjoy the benefits of the relational model. For example, hierarchical databases do not have the ability to relate a table to one or more other tables based on like cell values.
In addition to the inefficient storage of hierarchical structures in an RDBMS, another disadvantage with existing DBMSs is in the manner in which data is stored and searched. Relational databases are designed to store table
Brookler David E.
Hazi Ariel
Sullivan Dave L.
Tham Dominic
Tinari Philip A.
A2i, Inc.
Breene John
Dalina Law Group P.C.
Mayo Joseph J.
Wassum Luke S
LandOfFree
Efficient storage and access in a database management system does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Efficient storage and access in a database management system, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Efficient storage and access in a database management system will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3348357