Storing multidimensional data in a relational database...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000, C707S793000, C707S793000

Reexamination Certificate

active

06484179

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to relational database management systems and, more specifically, to techniques for storing multidimensional data in relational database management systems.
BACKGROUND OF THE INVENTION
In the context of database systems, a “dimension” is a list of values that provide categories for data. A dimension acts as an index for identifying values of a variable. For example, if sales data has a separate sales figure for each month, then the data has a MONTH dimension. That is, the data is organized by month. A dimension is similar to a key in a relational database. Data that is organized by two or more dimensions is referred to as “multidimensional data”.
Any item of data within a multidimensional variable can be uniquely and completely selected by specifying one member from each of the variable's dimensions. For example, if a sales variable is dimensioned by MONTH, PRODUCT, and MARKET, specifying “January” for the MONTH dimension, “Stereos” for the PRODUCT dimension, and “Eastern Region” for the MARKET dimension uniquely specifies a single value of the variable. Thus, dimensions offer a concise and intuitive way of organizing and selecting data for retrieval, updating, and performing calculations.
Multidimensional data may be stored in relational database systems (“ROLAP” systems) or in specialized, “multidimensional” database systems (“MOLAP” systems). Multidimensional database systems provide structures and access techniques specifically designed for multidimensional data, and therefore provide relatively efficient storage and access to multidimensional data. However, when stored in specialized multidimensional database systems, only applications that are specially built to interact with those multidimensional database systems are able to access and manipulate the data.
On the other hand, when stored in relational database systems, all applications that support interaction with relational databases have access to the data. Such database applications communicate with the relational database system by submitting commands that conform to the database language supported by the relational database system, the most common of which is the Structured Query Language (SQL).
Relational database systems store data in the form of related tables, where each table has one or more columns and zero or more rows. The conventional mechanism for storing multidimensional data in a relational database system is to store the data in tables arranged in what is referred to as a star schema. In relational database systems, a star schema is distinguished by the presence of one or more relatively large tables and several relatively smaller tables. Rather than duplicating the information contained in the smaller tables, the large tables contain references (foreign key values) to rows stored in the smaller tables. The larger tables within a star schema are referred to as “fact tables”, while the smaller tables are referred to as “dimension tables”.
FIG. 1
illustrates an exemplary star schema with two dimensions.
Referring to
FIG. 1
, it illustrates a database
100
that includes tables
102
,
104
and
106
. Table
102
is named “store” and contains information about each of the stores in which sales may occur. Each row in store table
102
contains a unique store-id and information about the particular store that corresponds to the store-id. Table
104
is named “product” and contains information about each type of product that may be sold in any of the stores. Each row in product table
104
contains a unique product-id and information about the particular product.
Table
106
is named “sale” and contains information about each sale in each of the stores represented in the store table
102
. Each row in sale table
106
includes a dollar amount, a store-id to indicate the store at which the sale was made, a product-id to indicate the product sold in the sale, and the date of the sale. Typically, the number of sales will be vastly greater than both the number of stores at which the sales are made and the number of products carried by the stores. Detailed information about the store and product involved in a sale transaction does not have to be stored in the rows of table
106
because such detailed information is available in tables
102
and
104
, respectively. Instead, the rows of table
106
simply contain values (store-ids and product-ids) that reference information stored in the other tables
102
and
104
. Therefore, tables
102
,
104
and
106
constitute a star schema in which table
106
is the fact table and tables
102
and
104
are dimension tables.
The data stored in fact table
106
only has two dimensions, and therefore fact table
106
only has two columns dedicated to storing foreign key values for those dimensions. In general, a fact table must dedicate one column for storing foreign key values for each of the dimensions associated with the multidimensional data stored in the fact table. Thus, a fact table that stores data associated with twenty dimensions would have to dedicate twenty columns to the storage of foreign key values.
Storing multidimensional data within a relational database has two significant drawbacks. First, the fact table is significantly larger than it would have to be if it only had to store the multidimensional data itself. The massive size of the fact table, relative to the dimension data itself, is largely due to the need to store a foreign key value for each dimension for each multidimensional value. Second, the rows within a conventional fact table have no particular order. Consequently, multidimensional values that are closely related to each other conceptually may be stored relatively randomly throughout the entire fact table. This leads to inefficiencies because multidimensional data that is conceptually related to each other are frequently accessed and manipulated as a group.
An alternative approach to managing multidimensional data in a relational database involves storing the data in relational files but maintaining all multidimensional structure, metadata, administration, and access control using multidimensional database system techniques. Accessing relationally-stored data using multidimensional techniques poses numerous difficulties. For example, when all administration and access to the multidimensional data are controlled exclusively through the multidimensional database system engine, two database management systems must be administered. Further, database applications that access data using conventional relational commands (e.g. SQL commands) are unable to access the multidimensional data.
The approaches described above for storing multidimensional data in relational database systems demonstrate the tradeoffs made by prior approaches, which have either (1) sacrificed the benefits of multidimensional storage to enjoy the benefits of modem relational systems, such as conventional relational access, or (2) sacrificed the benefits of relational storage to attain the efficiency of multidimensional storage.
SUMMARY OF THE INVENTION
Techniques are provided which address the problems associated with the conventional approaches for storing multidimensional data in a relational database system. According to one aspect of the invention, the many foreign key values of each row in the fact table are mapped to and replaced by a “replacement” value. A mapping function is provided that allows the database server to derive a replacement value from any given combination of foreign key values, and an inverse mapping function is provided to reproduce the combination of foreign key values given the replacement value.
According to another aspect of the invention, a mapping function is selected such that the foreign key value combinations of multidimensional values that are conceptually related to each other map to values that are close to each other. The rows in the fact table are then stored within the fact table in the sorted order, where the replacement value derived from the foreign key values is used as th

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Storing multidimensional data in a relational database... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Storing multidimensional data in a relational database..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Storing multidimensional data in a relational database... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2937839

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.