Method and apparatus for efficiently refreshing sets of...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000, C345S215000, C345S215000, C709S201000

Reexamination Certificate

active

06334128

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a method and apparatus for efficiently refreshing sets of summary tables and materialized views in a database management system.
BACKGROUND OF THE INVENTION
In a database management system (DBMS), data is stored in one or more data containers, each container contains records, and the data within each record is organized into one or more fields. In relational database systems, the data containers are referred to as tables, the records are referred to as rows, and the fields are referred to as columns. In object oriented databases, the data containers are referred to as object classes, the records are referred to as objects, and the fields are referred to as attributes. Other database architectures may use other terminology.
The present invention is not limited to any particular type of data container or database architecture. However, for the purpose of explanation, the examples and the terminology used herein shall be that typically associated with relational databases. Thus, the terms “table”, “row” and “column” shall be used herein to refer respectively to the data container, record, and field.
In a database used for “data warehousing” or “decision support”, it is common for identical or closely related queries to be issued frequently. For example, a business may periodically generate reports that summarize the business facts stored in the database, such as: “What have been the best selling brands of soft drinks in each of our sales regions, during the past six months?”.
To respond to such queries, the database server typically has perform numerous joins because the database records that contain the information that is required to respond to the queries are often organized into a star schema. A star schema is distinguished by the presence of one or more relatively large tables and several relatively smaller tables. Rather than duplicating the information contained in the smaller tables, the large tables contain references (foreign key values) to rows stored in the smaller tables. The larger tables within a star schema are referred to as “fact tables”, while the smaller tables are referred to as “dimension tables”.
When a database management system contains very large amounts of data, certain queries against the database can take an unacceptably long time to execute. The cost of executing a query may be particularly significant when the query (which takes the form of a “SELECT” statement in the SQL database language) requires joins among a large number of database tables.
MATERIALIZED VIEWS
Among commercial users of database systems, it has become a common practice to store the results of often-repeated queries in database tables or some other persistent database object. By storing the results of queries, the costly join operations required to generate the results do not have to be performed every time the queries are issued. Rather, the database server responds to the queries by simply retrieving the pre-computed data.
These stored results are commonly referred to as materialized views. The contents of a materialized view is defined by metadata referred to as a view definition. The view definition contains mappings to one or more columns in the one or more tables containing the data. Typically, the view definition is in the form of a database query.
Columns and tables that are mapped to a materialized view are referred to herein as base columns and base tables of the materialized view, respectively. The data maintained in the base columns is referred to herein as base data. The data contained in a materialized view is referred to herein as materialized data.
Materialized views eliminate the overhead associated with gathering and deriving the data every time a query is executed. Computer database systems that are used for data warehousing frequently maintain materialized views that contain pre-computed summary information in order to speed up query processing. Such summary information is created by applying an aggregate function, such as SUM, COUNT, or AVERAGE, to values contained in the base tables. Materialized views that contain pre-computed summary information are referred to herein as “summary tables” or more simply, “summaries”.
Summary tables typically store aggregated information, such as “sum of PRODUCT_SALES, by region, by month”. Other examples of aggregated information include counts of tally totals, minimum values, maximum values, and average calculations.
QUERY REWRITE
Through a process known as query rewrite, a query can be optimized to recognize and use existing materialized views that could answer the query. Typically, the query rewrite optimization is transparent to the application submitting the query. That is, the rewrite operation happens automatically and does not require the application to know about the existence of materialized views, nor that a particular materialized view has been substituted for the original query.
Various query rewrite operations are described in the following U.S. Patent Applications:
U.S. patent application Ser. No. 09/221,641 entitled REWRITING A QUERY IN TERMS OF A SUMMARY BASED ON FUNCTIONAL DEPENDENCIES AND JOIN BACKS filed by Randall Bello, James Finnerty, and Mohamed Ziauddin on Dec. 28, 1998.
U.S. patent application Ser. No. 09/221,363 entitled REWRITING A QUERY IN TERMS OF A SUMMARY BASED ON ONE-TO-ONE LOSSLESSNESS OF JOINS filed by Randall Bello, James Finnerty, and Mohamed Ziauddin on Dec. 28, 1998.
U.S. patent application Ser. No. 09/222,249, entitled REWRITING A QUERY IN TERMS OF A SUMMARY BASED ON AGGREGATE COMPUTABILITY AND CANONICAL FORMAT filed by Jack Raitto, Mohamed Ziauddin, and James Finnerty on Dec. 28,1998 and now U.S. Pat. No. 5,991,754.
The contents of these applications are incorporated herein by this reference.
REFRESHING MATERIALIZED VIEWS
As new data is periodically added to the base tables of a materialized view, the materialized view needs to be updated to reflect the new base data. When a materialized view accurately reflects all of the data currently in its base tables, the materialized view is considered to be “fresh”. Otherwise, the materialized view is considered to be “stale”. A stale materialized view may be recomputed by various techniques that are collectively referred to as “refresh”.
Data loading and refresh of materialized views typically takes place during off-hours when the data warehouse is in a controlled period of little activity. The data loading and refresh is restricted to a time period called the refresh window during which the system can be dedicated to refresh. The refresh window is typically allowed to be no more than four to six hours. Refresh may be deferred until the end of the week or month, so that loading of additional detail data may occur much more frequently than refresh.
One approach to refreshing materialized views is referred to as the “total refresh” or “full refresh” approach. According to the total refresh approach, the values in materialized views are recalculated based on all of the base data every time new base data is supplied. Systems that employ full refresh approach have the disadvantage that the recreation process is a relatively lengthy operation due to the size and number of tables from which the materialized data is derived. For example, when ten new rows are added to a particular base table that contains a million rows, a total refresh operation would have to process all one million and ten rows of the base table to regenerate the materialized views derived using the base table.
The process of updating materialized data may be improved by performing incremental refresh, where rather than generating a new set of materialized data based on calculations that use all of the base data, the materialized data is updated based on just the new base data.
Prior mechanisms for incremental refresh have relied upon the re-computing of materialized data as data is being loaded into base tables. Thus, as one set of rows is added to a base table, the materialized view is updated based on the data in the

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and apparatus for efficiently refreshing sets of... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method and apparatus for efficiently refreshing sets of..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for efficiently refreshing sets of... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2571872

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.