Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-02-16
2001-08-21
Breene, John (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06278998
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to the field of database methods and systems. More particularly, the present invention relates, in one aspect, to methods and systems for analyzing and mining information stored in database systems. Still more particularly, aspects of the present invention relate to identifying and extracting information exhibiting inter-relationships and temporal cycles.
BACKGROUND OF THE INVENTION
Recent advances in data collection and storage technology have made it possible for many companies to keep large amounts of data relating to their business online. At the same time, low cost computing power has also made enhanced automatic analysis of these data feasible. This activity is commonly referred to as data mining.
One major application domain of data mining is in the analysis of transactional data. In this application database system records include information about user transactions, where each transaction is a collection of items. In this setting, association rules capture inter-relationships between various items. An association rule captures the notion of a set of items occurring together in transactions. For example, in a database maintained by a supermarket, an association rule might be of the form “beer→chips (3%, 87%),” which means that 3% of all database transactions contain the items beer and chips, and 87% of the transactions that have the item “beer” also have the item “chips” in them. The two percentage parameters above are commonly referred to as “support” and “confidence,” respectively. Typically, the data mining process is controlled by a user who sets minimum thresholds for the support and confidence parameters. The user might also impose other restrictions, such as restricting the search space of items, in order to guide the data mining process.
Following the early work in Agrawal, R., T. Imielinski and A. Swami “Mining Association Rules between Sets of Items in Large Databases,”
Proc.
1993
ACM SIGMOD Intl. Conf. on Management of Data
, pp. 207-216, Wash., D.C., May 1993, association rules have been extensively studied. (The last-cited paper will be referred to in the sequel as “Agrawal, et al 93.”) However, this work treats data as one large segment, with no attention paid to segmenting data over different time intervals. For example, returning to our previous example, it may be the case that beer and chips are sold together primarily between 6 PM and, 9 PM. Therefore, if we segment the data over the intervals 7 AM-6 PM and 6 PM-9 PM, we may find that the support for the beer and chips rule jumps to 50%.
Prior data mining systems and methods have failed to provide for identifying, analyzing and reporting time-dependent associated data in an efficient, readily usable manner.
SUMMARY OF THE INVENTION
Limitations of the prior art are overcome and a technical advance is made in accordance with the present invention described in illustrative embodiments herein.
In one aspect, the present invention provides systems and methods for discovering regularities in the behavior of association rules over time. These techniques enable marketers and others to better identify trends in sales and other contexts, and to allow better forecasting of future events, such as user demand for products, services or other resources.
Typically, transactional data to be analyzed are time-stamped and user-divided into disjoint segments corresponding to respective time intervals. In a common arrangement, users opt for “natural” segmentation of data based on months, weeks, days, etc. In any event, users are usually best qualified to make this decision based on their understanding of the underlying data.
In accordance with another aspect of the present invention, we refer to an association rule as cyclic if the rule has the minimum confidence and support at regular time intervals. Such a rule need not hold for the entire transactional database, but rather only for transactional data in a particular periodic time interval. That is, each cyclic rule must have the user specified minimum support and confidence over a specific periodic time interval. The user typically specifies upper and lower bounds for the periods of such time intervals.
We consider mining cyclic association rules as generating all cycles of association rules. Given a large database comprising transactional information, where each transaction consists of a transaction-id, a set of items and a time-stamp, we provide efficient algorithms to discover such cyclic association rules.
In studying the interaction between cycles and association rule mining, we identify techniques for cycle pruning, cycle skipping and cycle elimination that allow us to significantly reduce the amount of wasted work typically expended during data mining processes.
While an extension of existing association rule mining techniques (treating association rules and cycles independently) can be pursued for some applications, preferred embodiments of the present invention generates cyclic association rules using a two-phase technique. In a first phase, cyclic large itemsets are discovered and advantageously reduced using cycle-pruning, cycle-skipping and cycle-elimination techniques. In a second phase cyclic rules are discovered by sequentially processing results from phase one for each user-specified time interval.
Further aspects of inventive embodiments and the effectiveness of described techniques are demonstrated in disclosed examples.
REFERENCES:
patent: 5832482 (1998-11-01), Yu et al.
patent: 5943667 (1999-08-01), Aggarwal et al.
patent: 5946683 (1999-08-01), Rastogi et al.
patent: 5983222 (1999-11-01), Morimoto et al.
patent: 6023571 (2000-02-01), Matsumoto et al.
patent: 6061682 (2000-05-01), Aggarwal et al.
patent: 6092064 (2000-07-01), Aggarwal et al.
patent: 6094645 (2000-07-01), Aggarwal et al.
patent: 6134555 (2000-10-01), Chadha et al.
patent: 6151601 (2000-11-01), Papierniak et al.
patent: 6185549 (2001-02-01), Rastogi et al.
patent: 6189005 (2001-02-01), Chakrabarti et al.
Ozden, B.; Ramaswamy, S.; Silberschatz, A.; Cyclic Association Rules, Feb. 1998, IEEE.
Ozden Banu
Ramaswamy Sridhar
Silberschatz Abraham
Breene John
Le Debbie M
Lucent Technologies - Inc.
Ryan William
LandOfFree
Data mining using cyclic association rules does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Data mining using cyclic association rules, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Data mining using cyclic association rules will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2465511