Real time sessions in an analytic application

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate


Details

C707S793000

active

06789096

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to database management systems. More specifically, the present invention pertains to a method for real time processing of a dynamically increasing computer database used in an analytic application.
BACKGROUND OF THE INVENTION
Computers are used to perform a wide variety of applications in such diverse fields as finance, traditional and electronic commercial transactions, manufacturing, health care, telecommunications, etc. Most of these applications typically involve inputting or electronically receiving data, processing the data according to a computer program, then storing the results in a database, and perhaps transmitting the processed data to another application, messaging system, or user in a computer network. As computers became more powerful, faster, and more versatile, the amount of data that could be processed correspondingly increased.
Furthermore, the expanding use of “messaging systems” enhances the capacity of networks to transmit current operational data and to provide interoperability between disparate database systems. Messaging systems are computer systems that allow logical elements of diverse applications to link seamlessly with one another. Messaging systems also provide for the delivery of data across a broad range of hardware and software platforms and allow applications to interoperate across network links despite differences in underlying communications protocols, system architectures, operating systems, and database services.
Prior Art FIG. 1 illustrates the characteristics of the various environments in which data processing can occur. The types of environments are characterized according to whether they operate on a batch basis or on a transactional basis (that is, whether data are operated on in bulk, or handled in smaller quantities such as on a per-transaction basis). The types of environments are also characterized according to whether the data need to be operated on in real time (e.g., essentially right away) or whether some latency in the processing can be tolerated.
Prior Art FIG. 1 shows ETL (extraction/transformation/loading) space 1, EAI (enterprise application and integration) space 2, B2B (business-to-business) space 3, and process integration space 4. ETL space 1 is characterized by large amounts of data handled in bulk, with some degree of latency occurring between the time data are received and the time processing of the data is completed. EAI space 2 is characterized by smaller amounts of data handled essentially in real time. B2B space 3 is characterized as handling larger amounts of data than that of EAI space 2 in essentially real time. However, the amount of data handled in B2B space 3 is generally not as large as that handled in ETL space 1. Process integration space 4 primarily deals with the integration of business processes handling smaller amounts of data with some degree of associated latency. Of particular interest to the discussion herein are ETL space 1 and EAI space 2.
In ETL space 1, large amounts of data exist in operational databases. The raw data found in the operational databases often exist as rows and columns of numbers and codes which, when viewed by individuals, may appear bewildering and incomprehensible. Furthermore, the scope and vastness of the raw data stored in modern databases can be overwhelming. Hence, analytic applications were developed in an effort to help interpret, analyze, and compile the data so that it may be readily and easily understood. This is accomplished by transforming (e.g., sifting, sorting, and summarizing) the raw data before it is presented for display, storage, or transmission. The transformed data are loaded into target databases in a data warehouse or data mart. Individuals can access the target databases, interpret the transformed data, and make key decisions based thereon.
An example of the type of company that would use data warehousing is an online Internet bookseller having millions of customers located worldwide whose book preferences and purchases are tracked. By processing and warehousing this data, top executives of the bookseller can access the processed data from the data warehouse, which can be used to perform sophisticated analyses and make key decisions on how to better serve the preferences of their customers throughout the world.
One problem generally associated with transforming data for a data mart or data warehouse is that, because of the huge amounts of data to be processed, it can take a long time to perform. For the purpose of efficient utilization of computer resources, the transformation of data is normally conducted in a “batch” mode. Operational data are collected for a period of time and then extracted, transformed, and loaded into data warehouses/marts by the analytic application.
For example, sales data may be collected in the operational database for an entire week, processed by the database application in one continuous session over the weekend, and then aggregated into a target database stored in the data warehouse. The target database may reflect, for example, summary year-to-date sales by geographic region. The data warehouse storing the year-to-date sales data is updated only when all individual data accumulated for the previous week have been extracted and transformed. Between updates or even during an update session, end-users accessing the data warehouse will be presented with data from the target database current only to the previous week's update. Data accumulating for the next session's processing batch will not be reflected in the target database.
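The weekly update cycle in this example can be sketched in a few lines of Python. This is only an illustration of the batch behavior described above, not the implementation of any particular analytic application; the names `run_batch_session`, `target_ytd_by_region`, and `pending_rows`, as well as the sales figures, are hypothetical.

```python
from collections import defaultdict


def run_batch_session(target, pending_rows):
    """Fold every pending operational row into the summary in one pass."""
    updated = defaultdict(float, target)
    for region, amount in pending_rows:
        updated[region] += amount
    return dict(updated)


# Target database as of the previous weekend's session (made-up figures).
target_ytd_by_region = {"east": 1000.0, "west": 750.0}

# Sales accumulating in the operational database during the current week.
pending_rows = [("east", 200.0), ("west", 50.0), ("east", 25.0)]

# A mid-week query sees only data current to the previous update.
midweek_view = dict(target_ytd_by_region)

# Weekend session: extract, transform (aggregate), and load.
target_ytd_by_region = run_batch_session(target_ytd_by_region, pending_rows)
```

In this sketch, `midweek_view` still holds the figures from the previous session, while the new sales only appear in `target_ytd_by_region` after the weekend session has run.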
Thus, the batch mode of operation for processing data in ETL space 1 of Prior Art FIG. 1 can be problematic because of the latency between the time raw data are received and the time at which transformed data are ready for evaluation by end-users. The latency issue is compounded as large amounts of new operational (raw) data are frequently received for input into the data mart or data warehouse, in particular with the advent of messaging systems. However, the new data are not considered until the next time the target databases are updated.
In EAI space 2, data are more transactional in nature and thus the quantities of data requiring processing are smaller than quantities of data processed in ETL space 1. Accordingly, in EAI space 2, data can be processed essentially in real time (in essence, as the transaction occurs).
The boundaries between ETL space 1 and EAI space 2 are blurring, as end-users indicate their preference for processing large amounts of data (as in ETL space 1) with real time speed (as in EAI space 2). In addition, some applications driven from a data warehouse require constant and frequent updates of the data warehouse. To satisfy these objectives, it is becoming more common to shorten the period of time between target database updates in ETL space 1. That is, update sessions in the batch mode are run on a more frequent basis in an attempt to simulate real time processing.
However, there is a large computational cost associated with running update sessions more frequently in the batch mode. To launch a session, data transformation pipelines generally need to be established, caches and other data structures need to be built, and relevant data need to be identified, retrieved, and used to prime (initialize) the data transformation pipelines and to populate the caches and other data structures. These tasks can consume a portion of the user's time as well as a measurable portion of a computer system's available resources. The difficulty of simulating real time processing is increased by the need to complete these tasks within a short period of time. In essence, an update session must be initiated and executed within a time window that has been specified to be small enough to simulate real time processing.
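The repeated startup cost can be made concrete with a small Python sketch. It assumes a hypothetical `BatchSession` class whose constructor stands in for establishing a pipeline and populating a lookup cache; the class, the `reference_data` table, and the `micro_batches` are illustrative inventions, and a counter is used in place of real timing so the effect is easy to see.

```python
class BatchSession:
    """Hypothetical session whose startup work precedes any row processing."""

    startups = 0  # number of times the startup cost has been paid

    def __init__(self, reference_data):
        # Startup: establish the pipeline and populate the lookup cache.
        BatchSession.startups += 1
        self.cache = {key: region for key, region in reference_data}
        self.totals = {}

    def process(self, rows):
        # Transform: look up each row's region and accumulate its amount.
        for key, amount in rows:
            region = self.cache[key]
            self.totals[region] = self.totals.get(region, 0.0) + amount
        return self.totals


reference_data = [(1, "east"), (2, "west")]
micro_batches = [[(1, 10.0)], [(2, 5.0)], [(1, 2.5)]]

combined = {}
for rows in micro_batches:
    # A fresh session per update rebuilds the cache every time.
    session = BatchSession(reference_data)
    for region, total in session.process(rows).items():
        combined[region] = combined.get(region, 0.0) + total
```

Here three small updates trigger three full startups, even though the reference data never changes between them. Shortening the interval between updates increases the number of startups in the same proportion.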
Another problem with running update sessions more frequently is that, although in some aspects it may appear to simulate real time, in actuality
