Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2000-04-14
2003-05-13
Corrielus, Jean M. (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06564204
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to generation of join query results in the management and execution of relational database queries.
BACKGROUND OF THE INVENTION
Relational Database Management System (RDBMS) software using a Structured Query Language (SQL) interface is well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American Nationals Standard Organization (ANSI) and the International Standards Organization (ISO).
In RDBMS software, all data is externally structured into relations, each relation dealing with one or more attributes and comprising one or more tuples of data, each tuple associating attribute values with each other. A relation can be visualized as a table, having rows and columns (indeed, the relations in a particular database are often referred to as the “tables” of the database). When a relation is visualized as a table, the columns of the table represent attributes of the relation, and the rows of the table represent individual tuples or records that are stored using those attributes. To aid in visualizing relations in this manner, In the following, the relations in an RDBMS system will frequently be referred to as that system's “tables”.
An RDBMS system may be used to store and manipulate a wide variety of data. For example, consider a simple RDBMS system for storing and manipulating name and address information. In such a system, a “Name/Address” table storing name and address attributes might have a first column “Last Name” for the last name, a second column “First Name” for the first name, a third column “M.I.” for middle initial, and further columns for addresses. Each row in the table would include the name and M address data for a particular individual.
Often, columns in different tables are related. Thus, in the above example, the “Name/Address” table might have a column for a street address, and a column for a ZIP code or other postal code (i.e., a unique identifying index number or character string that identifies a general location of a street address). In this example, each row of the “Name/Address” table identifies a postal code for the individual described by that row, but does not identify a city. A second “City” table might store city names; in two columns, a first column for the city name and a second column for the postal code of the city. Note that there may be multiple persons in each postal code and so the “Name/Address” table is likely to have multiple entries identifying the same postal code. Furthermore, there are likely to be multiple postal codes in the same city, so the “City” table is likely to have multiple rows for the same city.
It should be noted that the order of tuples in a relation is not considered a feature of the relation; that is, two relations having the same tuples, but in a different order, are entirely equivalent to the RDBMS system. Tuple order is determined by user commands, if any, that specify a manner in which to sort the tuples. Absent a specified sort order, the order of the tuples in a relation is not defined, and tuples from the relation may be reported in any order.
An overall database organization is typically referred to as a schema for the database. A database schema is often compactly expressed using table names and names of columns in the table. Thus the simple schema including “Name/Address” and “City” tables described in the above example, could be expressed as:
Name/Address(LastName,FirstName,M.I.,PostalCode, . . . )
City(City,PostalCode)
Database schema often take the form of a “star”, where there is one very large “mother” table and many small detail tables. Queries typically involve selections based on the attributes of the detail tables, e.g. the City table storing city names, followed by retrieval of information for the selected tuples from the mother table, e.g. the Name/Address table storing names and addresses for persons. From the foregoing description, it can be seen that to find the persons in a particular city, the rows in the “City” table would be scanned to find those rows having the desired city name, and then the postal codes in those rows would be retrieved. Then, the “Name/Address” table would be scanned to locate all rows from the table having the postal code in the postal code column. The names and addresses for the selected rows are the individuals residing in the desired city.
A typical way of looking up information in tables uses indexes. For example, there may be an index into the Name/Address table identifying all of the rows that have a particular value for a postal code. This index is stored separately from the table and must be updated each time the table itself is updated. Accordingly, indexes introduce a substantial increase in storage requirements. However, if no index is available, a query into a table can only be satisfied by scanning each row of the table, which requires substantially longer processing time. In an environment such as decision support, where selections may be on an arbitrary number of detail tables, maximum speed requires indices on most or all columns of some tables, making such applications space-intensive. Typically, in other environments, performance is compromised to reduce storage requirements, by not providing a column index for at least some columns of a table.
One type of index is a bitmap index, which indicates whether a specific value exists for each row in a particular column. One bit represents each row. Thus, in the bitmap index for the value “45246” in the column “Postal Code,” the nth bit equals 1 if the nth row of the Name/Address table contains a postal code of “45246”, or 0 if that row holds a value other than “45246”. Typically there are multiple bitmap indexes for each column, one for each of several values that may appear in the column (e.g., one index for the value “45246”, a second index for the value “45202”, and so on). Another type of index is an encoded vector index (EVI), disclosed U.S. Pat. No. 5,706,495, issued Jan. 6, 1998 to Chadha et al., entitled ENCODED-VECTOR INDICES FOR DECISION SUPPORT AND WAREHOUSING, which is incorporated by reference. An EVI serves a similar purpose as a bitmap index, but only one index is necessary to account for all the values occurring in the column (whether they be “45246, ” “45202, ” or any other). In an EVI on the “Postal Code” column, the nth position of the EVI contains a bit code, that can be decoded using a lookup table to produce the value “45246”, which is the postal code in the nth row of the table. Thus, whereas a separate bitmap index is required to map each particular key value in a database field, only one EVI is required to represent the same information. Thus, an EVI saves computer memory by including all possible key values for a given field in one database index. Notably, however, both a bitmap index and EVI index indexes only information relating to a single column of the table. These indexes do not reflect the relations between values in multiple columns.
Turning now to a description of SQL, the main RDBMS operation described in the preceding examples is known as a JOIN, in the illustrated case, between the “City” table and “Name/Address” table. This is one example of the many operators that are provided by an SQL interface to RDBMS software. The SQL interface allows users to formulate relational operations on tables either interactively, in batch files, or embedded in host languages such as C, COBOL, etc. Operators are provided in SQL that allow the user to manipulate the data; each operator operates on either one or two tables and produces a new table as a result. The power of SQL lies on its ability to link information from multiple tables or views together to perform complex sets of procedures such as the simple procedure in the above example, with a single statement.
The operators provided by SQL are derived from an original set of eight operators:
RESTRICT Extracts specified tuples from a specified relation (i.e., retrieves specified rows from a table)
Abdo Abdo Esmail
Amundsen Lance Christopher
Bestgen Robert Joseph
Driesch, Jr. Robert Douglas
Toft Daniel Virgil
Corrielus Jean M.
International Business Machines - Corporation
Liang Gwen
Wood Herron & Evans LLP
LandOfFree
Generating join queries using tensor representations does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Generating join queries using tensor representations, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Generating join queries using tensor representations will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3055568