System and method of using clustering to find personalized...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C702S002000, C702S003000, C702S007000

active

06408295

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to systems and methods for generating association rules for describing relationships among items in a database, and particularly, to a system and method implementing skew of data included in a database of sales transactions for determining personalized association rules.
2. Discussion of the Prior Art
Association rules are generated to find the relationships between different items in a database of transactions, e.g., a sales transaction. A sales transaction is a set of items purchased by a given consumer at one time. Such rules track the buying patterns of consumers, e.g., finding how the presence of one item in a transaction affects the presence of another and so forth. The problem of association rule generation has recently gained considerable prominence in the data mining community because of its potential as an important tool for knowledge discovery.
Given I={i1, i2, . . . , im} as a set of binary literals called items, each transaction T is a set of items, such that T is a subset of I. This corresponds to the set of items which a consumer may buy in a basket transaction. An association rule is a condition of the form X==>Y where X and Y are two sets of items. The idea of an association rule is to develop a systematic method by which a user may infer the presence of some sets of items, given the presence of other items in a transaction. Such information is useful in making decisions such as customer targeting, shelving, and sales promotions.
An important approach to the association rule problem was developed by Agrawal, et al., such as described in the reference by Agrawal R., Imielinski T., and Swami A., entitled “Mining Association Rules Between Sets of Items in Very Large Databases,” Proceedings of the ACM SIGMOD Conference on Management of Data, pages 207-216, 1993 (Agrawal et al.). As described, the term SUPPORT of a rule X==>Y is defined as the fraction of transactions which contain both X and Y. The CONFIDENCE of a rule X==>Y is the fraction of transactions containing X, which also contain Y. Thus, if a rule has 90% confidence, then it means that 90% of the tuples containing X also contain Y. The approach taken by Agrawal et al. is a two-phase large itemset approach implemented as follows: 1) the first step is to generate all combinations of items that have fractional transaction support above a certain user-defined threshold called MINSUPPORT; these combinations are herein referred to as LARGE ITEMSETS; and 2) the second step is to generate the rules from the large itemsets. Given a large itemset X={i1, i2, . . . , ik}, it may be used to generate at most k rules of the type [X−{ir}]==>ir for each r in {1, . . . , k}. Once these rules have been generated, only those rules having a confidence above a certain user-defined threshold called MINCONFIDENCE are retained. The most computationally intensive part of the association rule problem is that of finding large itemsets. The second step of actually generating the rules is relatively straightforward.
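The two-phase approach described above may be illustrated by the following minimal Python sketch. It is an illustration only and not the algorithm of Agrawal et al.: the first phase here enumerates candidate itemsets by brute force, and the function names, the toy transactions, and the threshold values are assumptions introduced for the example.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    x, y = frozenset(x), frozenset(y)
    support_x = support(x, transactions)
    return support(x | y, transactions) / support_x if support_x else 0.0


def large_itemsets(transactions, minsupport):
    """Phase 1 (brute force): every itemset with support >= minsupport."""
    items = sorted(set().union(*transactions))
    found = []
    for k in range(1, len(items) + 1):
        level = [frozenset(c) for c in combinations(items, k)
                 if support(c, transactions) >= minsupport]
        if not level:
            break  # no large k-itemset means no larger itemset can be large
        found.extend(level)
    return found


def rules(itemsets, transactions, minconfidence):
    """Phase 2: rules of the type [X - {i_r}] ==> i_r, filtered by confidence."""
    kept = []
    for x in itemsets:
        if len(x) < 2:
            continue  # a rule needs a non-empty antecedent
        for item in sorted(x):
            antecedent = x - {item}
            c = confidence(antecedent, {item}, transactions)
            if c >= minconfidence:
                kept.append((antecedent, item, c))
    return kept


# Hypothetical basket data, chosen only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

large = large_itemsets(transactions, minsupport=0.5)
found_rules = rules(large, transactions, minconfidence=0.9)
for antecedent, item, c in found_rules:
    print(sorted(antecedent), "==>", item, f"(confidence {c:.2f})")
```

In this toy data the only rule retained is {butter}==>bread, since every transaction containing butter also contains bread. A practical implementation would replace the brute-force enumeration of the first phase with a candidate-pruning scheme, because that phase dominates the cost.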
Initially, the method was proposed only for the case of transaction data; however, further research has been devoted to speeding up the algorithm and extending the approach to other scenarios, such as described in the following references: Agrawal R., Imielinski T., and Swami A., “Mining Association Rules Between Sets of Items in Very Large Databases”, Proceedings of the ACM SIGMOD Conference on Management of Data, pages 207-216, 1993; Agrawal R., Mannila H., Srikant R., Toivonen H., and Verkamo A. I., “Fast Discovery of Association Rules”, Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, Chapter 12, pages 307-328, and, Proceedings of the 20th International Conference on Very Large Data Bases, pages 487-499, 1994; Brin S., Motwani R., Ullman J. D., and Tsur S., “Dynamic Itemset Counting and Implication Rules for Market Basket Data”, Proceedings of the ACM SIGMOD, 1997, pages 255-264; Han J. and Fu Y., “Discovery of Multi-level Association Rules From Large Databases”, Proceedings of the International Conference on Very Large Databases, pages 420-431, Zurich, Switzerland, September 1995; Lent B., Swami A., and Widom J., “Clustering Association Rules”, Proceedings of the Thirteenth International Conference on Data Engineering, pages 220-231, Birmingham, U.K., April 1997; Mannila H., Toivonen H., and Verkamo A. I., “Efficient Algorithms for Discovering Association Rules”, AAAI Workshop on Knowledge Discovery in Databases, 1994, pages 181-192; Park J. S., Chen M. S., and Yu P. S., “An Effective Hash-based Algorithm for Mining Association Rules”, Proceedings of the ACM SIGMOD Conference on Management of Data, 1995; Savasere A., Omiecinski E., and Navathe S. B., “An Efficient Algorithm for Mining Association Rules in Large Databases”, Proceedings of the 21st International Conference on Very Large Databases, 1995; Srikant R., and Agrawal R., “Mining Generalized Association Rules”, Proceedings of the 21st International Conference on Very Large Data Bases, 1995, pages 407-419; Srikant R., and Agrawal R., “Mining Quantitative Association Rules in Large Relational Tables”, Proceedings of the ACM SIGMOD Conference on Management of Data, 1996, pages 1-12; and, Toivonen H., “Sampling Large Databases for Association Rules”, Proceedings of the 22nd International Conference on Very Large Databases, Bombay, India, September 1996.
Another area of research to which this invention is related is referred to as clustering. The problem of clustering is that of segmenting the data into groups of similar objects. The problem of finding clusters in high dimensional data has been discussed in the following references: R. Agrawal, J. Gehrke, D. Gunopulos and P. Raghavan, “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications”, Proceedings of the ACM SIGMOD International Conference on Management of Data, Seattle, Wash., 1998; M. Ester, H.-P. Kriegel and X. Xu, “A Database Interface for Clustering in Large Spatial Databases”, Proceedings of the First International Conference on Knowledge Discovery and Data Mining, 1995; M. Ester, H.-P. Kriegel, J. Sander and X. Xu, “A Density Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”, Proceedings of the 2nd International Conference on Knowledge Discovery in Databases and Data Mining, Portland, Ore., August 1996; R. Kohavi and D. Sommerfield, “Feature Subset Selection Using the Wrapper Method: Overfitting and Dynamic Search Space Topology”, Proceedings of the First International Conference on Knowledge Discovery and Data Mining, 1995; S. Guha, R. Rastogi and K. Shim, “CURE: An Efficient Clustering Algorithm for Large Databases”, Proceedings of the 1998 ACM SIGMOD Conference, pages 73-84, 1998; R. Ng and J. Han, “Efficient and Effective Clustering Methods for Spatial Data Mining”, Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, 1994, pages 144-155; and, T. Zhang, R. Ramakrishnan and M. Livny, “BIRCH: An Efficient Data Clustering Method for Very Large Databases”, Proceedings of the ACM SIGMOD International Conference on Management of Data, Montreal, Canada, June 1996.
The clustering and data segmentation techniques described in the prior art have heretofore never been applied for the purpose of generating personal association rules for customers.
Thus, it would be highly desirable to provide a system and method for finding personalized association rules by segmenting the data into groups of similar records, and using this segmentation in order to find the personalized rules. The motivation in finding personalized association rules is that e-commerce merchants are able to track buying behavior of customers using the online sales transaction data. This data may be used to determine association rules which are specific to each individual customer and thus, may be used as a tool for performing target marketing for that customer.
SUMMARY OF THE INVENTION
The
