Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-05-06
2001-04-03
Ho, Ruay Lian (Department: 2771)
C707S793000
active
06212524
ABSTRACT:
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by any one of the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
THE FIELD OF THE INVENTION
This invention relates to the field of databases. In particular, the invention relates to creating databases, and loading and accessing data in the databases.
BACKGROUND OF THE INVENTION
Many different types of databases have been developed. On line transaction processing (OLTP) databases are examples of typical databases used today. OLTP databases are concerned with the transaction oriented processing of data. On line transaction processing is the process by which data is entered into and retrieved from these databases. In these transaction-oriented databases, every transaction is guaranteed. Thus, at a very low level, the OLTP databases are very good at determining whether any specific transaction has occurred.
Another type of database is a data warehouse or datamart. A datamart transforms the raw data from the OLTP databases. The transformation supports queries at a much higher level than the OLTP atomic transaction queries. A data warehouse or a datamart typically provides not only the structure for storing the data extracted from the OLTP databases, but also query analysis and publication tools.
The advantage of datamarts is that users can quickly access data that is important to their business decision making. To meet this goal, datamarts should have the following characteristics. First, datamarts should be consistent in that they give the same results for the same search. The datamart should also be consistent in the use of terms to describe fields in the datamart. For example, “sales” should have one specific definition so that, when fetched from the database, it provides a consistent answer. Datamarts should also be able to separate and combine every possible measure in the business. Many of these issues are discussed in Ralph Kimball, The Data Warehouse Toolkit, John Wiley and Sons, Inc., New York, N.Y. (1996).
Multi-dimensional datamarts are one kind of datamart. Multi-dimensional datamarts rely on a dimension modeling technique to define the schema for the datamart. Dimension modeling involves visualizing the data in the datamart as a multi-dimensional data space (e.g., imagine the data as a cube). Each dimension of that space corresponds to a different way of looking at the data. Each point in the space, defined by the dimensions, contains measurements for a particular combination of dimensions. For example, a three dimensional cube might have product, customer, and territory dimensions. Any point in that cube, defined by those three dimensions, will represent data that relates those three dimensions.
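As an illustration of the cube view described above, the following is a minimal Python sketch, not taken from the patent. The dimension names follow the product, customer, and territory example; the data values and the helper name roll_up are hypothetical.

```python
from collections import defaultdict

# Order of the dimensions in each cube coordinate (from the example in the text).
DIMENSIONS = ("product", "customer", "territory")

# Hypothetical cube: each point (product, customer, territory) holds one measure.
cube = {
    ("widget", "Acme", "West"): 100.0,
    ("widget", "Globex", "East"): 40.0,
    ("gadget", "Acme", "West"): 60.0,
}


def roll_up(cube, dimension):
    """Sum the measure over all points that share a value of one dimension."""
    index = DIMENSIONS.index(dimension)
    totals = defaultdict(float)
    for point, measure in cube.items():
        totals[point[index]] += measure
    return dict(totals)
```

Calling roll_up(cube, "customer") looks at the same data along the customer dimension, while roll_up(cube, "product") looks at it along the product dimension.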
The data in the datamart is organized according to a schema. In a dimensional datamart, the data is typically organized as a star schema. At the center of a standard star schema is a fact table that contains measure data. Radiating outward from the fact table, like the points of a star, are multiple dimension tables. Dimension tables contain attribute data, such as the names of customers and territories. The fact table is connected, or joined, to each of the dimension tables, but the dimension tables are connected only to the fact table. This schema differs from that of many conventional relational databases where many tables are joined. The advantage of such a schema is that it supports a top down business approach to the definition of the schema.
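A star schema of this shape can be sketched with Python's standard sqlite3 module. This is only an illustrative example: the table names (sales_fact, customer_dim, territory_dim), the column names, and the rows are assumptions, not part of the patent.

```python
import sqlite3

# In-memory database holding one fact table and two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE territory_dim (territory_key INTEGER PRIMARY KEY, name TEXT);
-- The fact table holds the measure and one key per dimension.
CREATE TABLE sales_fact (
    customer_key INTEGER REFERENCES customer_dim(customer_key),
    territory_key INTEGER REFERENCES territory_dim(territory_key),
    amount REAL
);
INSERT INTO customer_dim VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO territory_dim VALUES (1, 'West'), (2, 'East');
INSERT INTO sales_fact VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0);
""")


def sales_by_territory(conn):
    """Join the fact table to one dimension table and total the measure."""
    return conn.execute("""
        SELECT t.name, SUM(f.amount)
        FROM sales_fact AS f
        JOIN territory_dim AS t ON t.territory_key = f.territory_key
        GROUP BY t.name
        ORDER BY t.name
    """).fetchall()
```

Note that the dimension tables never reference each other; every join goes through the fact table, which is the property that distinguishes the star layout.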
Present datamarts have a number of drawbacks that are now discussed. First, datamarts are typically difficult to build and maintain. This is because of the requirements that they be consistent and flexible. A related drawback of present day datamarts is that they do not allow the consultants of the datamart to make changes to the schema simply and easily. Because datamarts support very high level queries about the business processes in the business, they require a great deal of consistency in the use of data from the OLTP systems. Additionally, the datamarts need to be very flexible to address changes in the types of high level queries supported. Changing a typical datamart requires changing hundreds, or potentially thousands, of lines of SQL code. For example, if a fact column is added to a fact table, the change propagates throughout the datamart. These changes are typically implemented by hand, a very time consuming and error prone process. As a result of the hand coding involved, it is quite possible to construct the database in an arbitrary fashion that does not conform to good rules for constructing datamarts. Thus, well-formed datamarts may not result.
Thus an improved data warehousing technology is desired.
A SUMMARY OF THE INVENTION
One embodiment of the invention includes a method of generating a datamart. The datamart includes tables having rows and columns. The method comprises accessing a description of a schema. The schema defines the relationships between the tables and columns. The description further defines how data is to be manipulated and used to populate the tables in the datamart. That is, the description defines the semantic meaning of the data. The description is further used to create a set of commands to create the tables. The commands are executed, causing the creation of the tables. Importantly, when the semantic meaning is associated with the columns and rows, programs for manipulating and propagating data into those columns and rows are automatically defined. Previously, consultants would have to hand code the creation, manipulation, and population programs for a datamart. Thus, this embodiment of the invention significantly reduces the amount of work required to create and populate the datamart.
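The idea of deriving table-creation commands from a schema description can be sketched as follows. This is a simplified illustration and not the patented method: the dictionary format of SCHEMA_DESCRIPTION and the function names are hypothetical, and SQLite stands in for whatever database an actual system would target.

```python
import sqlite3

# Hypothetical schema description: table name -> {column name: SQL type}.
SCHEMA_DESCRIPTION = {
    "customer_dim": {"customer_key": "INTEGER PRIMARY KEY", "name": "TEXT"},
    "sales_fact": {"customer_key": "INTEGER", "amount": "REAL"},
}


def generate_create_commands(description):
    """Turn the description into one CREATE TABLE command per table."""
    commands = []
    for table, columns in description.items():
        column_list = ", ".join(
            f"{name} {sql_type}" for name, sql_type in columns.items()
        )
        commands.append(f"CREATE TABLE {table} ({column_list})")
    return commands


def build_datamart(description):
    """Execute the generated commands against a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    for command in generate_create_commands(description):
        conn.execute(command)
    return conn
```

Because the commands are generated, adding a column means editing one entry in the description rather than editing SQL by hand in several places.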
In some embodiments, the semantic meaning of the columns and rows is defined by semantic templates. Each template defines how a particular type of data is populated into the datamart. The template includes program code that can be used to populate the tables and columns in a consistent manner for subsequent querying. Importantly, the specific programs are generated from both the template programs and the schema definition.
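One way to picture a semantic template is as a parameterized command that is filled in from the schema definition. The sketch below is an assumption-laden illustration: the template names (additive_measure, row_count), the template text, and the column definition format are all hypothetical and are not taken from the patent.

```python
# Hypothetical semantic templates: each describes how one kind of column
# is populated from a staging table.
SEMANTIC_TEMPLATES = {
    "additive_measure": (
        "INSERT INTO {table} ({key}, {column}) "
        "SELECT {key}, SUM({column}) FROM {staging} GROUP BY {key}"
    ),
    "row_count": (
        "INSERT INTO {table} ({key}, {column}) "
        "SELECT {key}, COUNT(*) FROM {staging} GROUP BY {key}"
    ),
}


def generate_population_command(column_definition):
    """Combine a template with one column's schema definition."""
    template = SEMANTIC_TEMPLATES[column_definition["semantic"]]
    return template.format(
        table=column_definition["table"],
        key=column_definition["key"],
        column=column_definition["column"],
        staging=column_definition["staging"],
    )


# Hypothetical schema entry for one fact column.
amount_column = {
    "semantic": "additive_measure",
    "table": "sales_fact",
    "key": "customer_key",
    "column": "amount",
    "staging": "staging_sales",
}
```

Here the specific command comes from both inputs: the template supplies the population logic, and the schema entry supplies the table and column names.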
In some embodiments of the invention, populating the datamart includes two phases. The first phase includes loading a set of staging tables with data extracted from the source systems. The second phase includes using the semantic templates to convert the extracted data into a format that can be used in the tables.
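The two phases can be sketched with Python's sqlite3 module. This is a minimal illustration under stated assumptions: the table names staging_sales and sales_fact, the function name populate, and the fixed conversion query (standing in for a command generated from a semantic template) are all hypothetical.

```python
import sqlite3


def populate(source_rows):
    """Load extracted rows into staging, then convert them into the fact table."""
    conn = sqlite3.connect(":memory:")
    # Staging keeps the data as extracted; here amounts arrive as text.
    conn.execute("CREATE TABLE staging_sales (customer TEXT, amount TEXT)")
    conn.execute("CREATE TABLE sales_fact (customer TEXT, total_amount REAL)")

    # Phase 1: load the staging table with data extracted from the source.
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?)", source_rows)

    # Phase 2: convert staged data into the format used by the datamart table.
    conn.execute("""
        INSERT INTO sales_fact (customer, total_amount)
        SELECT customer, SUM(CAST(amount AS REAL))
        FROM staging_sales
        GROUP BY customer
    """)
    return conn.execute(
        "SELECT customer, total_amount FROM sales_fact ORDER BY customer"
    ).fetchall()
```

Keeping the staging step separate means the conversion in phase 2 does not depend on how the rows were extracted.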
In some embodiments, the system does not necessarily include the extraction of the data from the source systems, but does include automatically generating a set of commands to convert data, provided by another system, into formats that can be used by the datamart.
In some embodiments, the schema is a star schema having one or more fact tables and one or more dimension tables (or dimensions). The schema can be held in a constellation that includes additional information. The constellation can correspond to a business process.
In other embodiments, the datamart created is object based; therefore, objects are created rather than tables.
Although many details have been included in the description and the figures, the invention is defined by the scope of the claims. Only limitations found in those claims apply to the invention.
REFERENCES:
patent: 5386556 (1995-01-01), Hedin et al.
patent: 5550971 (1996-08-01), Brunner et al.
patent: 5659724 (1997-08-01), Borgida et al.
patent: 5675785 (1997-10-01), Hall et al.
patent: 5806060 (1998-09-01), Borgida et al.
patent: 5995958 (1999-11-01), Xu
Kimball, R., “The Data Warehouse Toolkit”, John Wiley & Sons, Inc. (1996), 388 pages (includes CD ROM).
Chawathe, S. et al., “Change Detection in Hierarchically Structured Information”, SIGMOD Record, vol. 25, No. 2, Jun. 1996, pp. 493-504.
Chawathe, S. et al., “Meaningful Change Detection in Struct
Slater, Jr. Lynn Randolph
Walsh Gregory Vincent
Weissman Craig David
E.piphany, Inc.
Ho Ruay Lian
Marino Fabio E.
Skjerven Morrill & MacPherson LLP