Data mining method and system for generating a decision tree cla

Patent

Details

395/611, 395/612, G06F 17/30

Patent

active

057872740

ABSTRACT:
A method and apparatus are disclosed for generating a decision tree classifier from a training set of records. The method comprises the steps of: pre-sorting the records based on each numeric record attribute, creating a decision tree breadth-first, and pruning the tree based on the MDL principle. Preferably, the pre-sorting includes generating a class list and attribute lists, and independently sorting the numeric attribute lists. The growing of the tree includes evaluating possible splitting criteria and selecting a splitting test for each leaf node, based on a splitting index, and updating the class list to reflect new leaf nodes. In a preferred embodiment, the splitting index is a gini index. The pruning preferably includes encoding the decision tree and splitting tests in an MDL-based code, and determining whether to convert a node into a leaf node, prune its child nodes, or leave the node intact, based on the code length of the node.
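The split evaluation described in the abstract can be illustrated with a minimal sketch. It assumes the standard gini index, gini(S) = 1 - sum of p_j^2 over the classes j, and scores a binary split by the size-weighted gini of the two partitions. The function names, the sample records, and the use of midpoints between adjacent values as candidate thresholds are illustrative assumptions, not details taken from the patent. The sketch also sorts a single attribute in place rather than maintaining the separate class list and attribute lists the abstract mentions.

```python
from collections import Counter


def gini(counts, total):
    """Gini index 1 - sum(p_j^2) for a partition described by its class counts."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_numeric_split(values, labels):
    """Find the threshold on one numeric attribute with the lowest weighted gini.

    The attribute is sorted once, then scanned in a single pass while
    class histograms for the two sides of the split are updated incrementally.
    Returns (score, threshold); threshold is None if no split is possible.
    """
    # Sorting stands in for the pre-sorted attribute list (illustrative only).
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    below = Counter()          # class counts for records at or below the threshold
    above = Counter(labels)    # class counts for records above the threshold
    best_score, best_threshold = float("inf"), None
    for rank in range(n - 1):
        i = order[rank]
        below[labels[i]] += 1
        above[labels[i]] -= 1
        next_value = values[order[rank + 1]]
        if values[i] == next_value:
            continue  # no threshold can separate two equal values
        n_below = rank + 1
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        if score < best_score:
            # Assumption: use the midpoint between adjacent values as the test.
            best_score, best_threshold = score, (values[i] + next_value) / 2
    return best_score, best_threshold


# Hypothetical training records: one numeric attribute and a class label.
ages = [23, 17, 43, 68, 32, 20]
risk = ["high", "high", "high", "low", "low", "high"]
score, threshold = best_numeric_split(ages, risk)
print(f"split at age <= {threshold}, weighted gini = {score:.4f}")
```

With these sample records the lowest weighted gini falls at the threshold 27.5, which separates the three youngest records (all "high") from the rest. Because the class histograms are updated incrementally, each candidate threshold is scored without rescanning the records, which is the practical benefit of sorting the attribute up front.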

REFERENCES:
patent: 4719571 (1988-01-01), Rissanen et al.
patent: 5490060 (1996-02-01), Malec et al.
Agrawal et al., "Quest: A project on Database Mining", SIGMOD Conference http://sunsite.ust.hkdblp/db/conf/sigmod/sigmod94-514.html, pp.1-2, 1994.
Sahni et al., Fundamentals Of Data Structures, chapter 6 section 6.2, pp. 292-301, 1983.
R. Agrawal et al., An Interval Classifier for Database Mining Applications, Proceedings of the 18th VLDB Conference Vancouver, British Columbia, Aug. 1992.
R. Agrawal et al., Database Mining: A Performance Perspective, IEEE Transactions on Knowledge and Data Engineering, vol. 5, No. 6, pp. 914-925, Special Issue on Learning and Discovery in Knowledge-Based Databases, Dec. 1993.
L. Breiman (Univ. of CA-Berkeley) et al. Classification and Regression Trees (Book) Chapter 2. Introduction to Tree Classification pp. 18-58, Wadsworth International Group, Belmont, CA 1984.
J. Catlett, Megainduction: Machine Learning on Very Large Databases, PhD thesis, Univ. of Sydney, Jun./Dec. 1991.
P. K. Chan et al., Experiments on Multistrategy Learning by Meta-learning. In Proc. Second Intl. Conf. on Info. and Knowledge Mgmt., pp. 314-323, 1993.
S. J. Hong, R-MINI: A heuristic Algorithm for Generating Minimal Rules from Examples. In 3rd Pacific Rim Int'l Conference on Artificial Intelligence, pp. 331-337, Aug. 1994.
U. Fayyad et al., The Attribute Selection Problem in Decision Tree Generation. In 10th Nat'l Conf. on AI AAAI-92, Learning: Inductive 1992.
M. James, Classification Algorithms (book), Chapters 1-3, QA278.65, J281 Wiley-Interscience Pub., 1985.
M. Mehta et al., MDL-based Decision Tree Pruning. Int'l Conference on Knowledge Discovery in Databases and Data Mining (KDD-95) Montreal, Canada, pp. 216-221, Aug. 1995.
J. R. Quinlan et al., Inferring Decision Trees Using Minimum Description Length Principle, Information and Computation 80, pp. 227-248, 1989. (0890-5401/89 Academic Press, Inc.).
Wallace et al., Coding Decision Trees, Machine Learning, 11, pp. 7-22, 1993. (Kluwer Academic Pub., Boston. Mfg. in the Netherlands.).
S. M. Weiss et al., Computer Systems that Learn, Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems, pp. 113-143, 1991. Q325.5, W432, C2, Morgan Kaufmann Pub. Inc., San Mateo, CA.
S. Murthy et al., Decision Tree Induction: How Effective is the Greedy Heuristic? Int'l Conference on Knowledge Discovery in Databases and Data Mining (KDD-95) Montreal, Canada, pp. 222-227.
No. 08/436,794, filed Apr. 14, 1994, for System and Method for Mining Generalized Association Rules in Databases U.S. Pat. No. 5,615,341 (presently unavailable to view).
