System and method for combining multiple learning agents to...
Classification: Data processing: artificial intelligence – Neural network (C706S021000)
Type: Reexamination Certificate (active)
Dates: 1998-11-19, 2002-09-10
Examiner: Starks, Jr., Wilbert L. (Department: 2122)
Patent number: 06449603
ABSTRACT:
FIELD OF THE INVENTION
This invention relates generally to artificial intelligence and machine learning. More particularly, this invention relates to a system and method for combining learning agents that derive prediction methods from a training set of data, such as neural networks, genetic algorithms and decision trees, so as to produce a more accurate prediction method.
BACKGROUND OF THE INVENTION
Prior machine learning systems typically include a database of information known as a training data set, from which a learning agent derives a prediction method for solving a problem of interest. This prediction method is then used to predict an event related to the information in the training set. For example, the training set may consist of past information about the weather, and the learning agent may derive a method of forecasting the weather based on that past information. The training set consists of data examples, each defined by features, values for those features, and a result. Continuing the weather-forecasting example, an example may include features such as barometric pressure, temperature, and precipitation, with their corresponding recorded values (mm of mercury, degrees, rain or not). The resulting weather is also included in the example (e.g., it did or did not rain the next day). The learning agent comprises a learning method, a set of parameters for implementing the method, and an input representation that determines how the training data's features will be considered by the method. Typical learning methods include statistical/Bayesian inference, decision-tree induction, neural networks, and genetic algorithms. Each learning method has a set of parameters whose values are chosen for a particular implementation of the method, such as the number of hidden nodes in a neural network. Similarly, each application of a learning method must specify the representation, that is, the features to be considered; for example, the semantics of the neural network's input nodes.
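As a concrete illustration of this structure, the sketch below shows how a weather training example and a learning agent might be represented in Python; the class and field names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Example:
    """One training example: recorded feature values plus the observed result (illustrative only)."""
    features: dict   # e.g. {"pressure_mm_hg": 752, "temp_deg_c": 18, "precipitation": "rain"}
    result: bool     # e.g. True if it rained the next day

@dataclass
class LearningAgent:
    """A learning agent: a method, parameter values for it, and an input representation (illustrative only)."""
    method: str                # e.g. "neural_network", "decision_tree", "genetic_algorithm"
    parameters: dict           # e.g. {"hidden_nodes": 8} for a neural network
    input_representation: list = field(default_factory=list)  # feature names (or combinations) the method sees

example = Example(features={"pressure_mm_hg": 752, "temp_deg_c": 18, "precipitation": "rain"},
                  result=True)
agent = LearningAgent(method="neural_network",
                      parameters={"hidden_nodes": 8},
                      input_representation=["pressure_mm_hg", "temp_deg_c", "precipitation"])
```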
One clear lesson of machine learning research is that problem representation is crucial to the success of all learning methods (see, e.g., Dietterich, T., “Limitations on Inductive Learning,” Proceedings of the Sixth International Workshop on Machine Learning (pp. 125-128), Ithaca, N.Y.: Morgan Kaufmann (1989); Rendell, L., & Cho, H., “Empirical Learning as a Function of Concept Character,” Machine Learning, 5(3), 267-298 (1990); Rendell, L., & Ragavan, H., “Improving the Design of Induction Methods by Analyzing Algorithm Functionality and Data-based Concept Complexity,” Proceedings of IJCAI (pp. 952-958), Chambéry, France (1993), all of which are hereby incorporated by reference). However, the choice of problem representation is generally a task performed by a human experimenter rather than by an automated machine learning system. Also significant for the generalization performance of machine learning systems is the selection of the learning method's parameter values, a task likewise generally carried out by human “learning engineers” rather than by the automated systems themselves.
The effectiveness of an input representation and the values of a learning method's free parameters are mutually dependent. For example, the appropriate number of hidden nodes for an artificial neural network depends crucially on the number and semantics of its input nodes. Yet, until now, no effective method has been developed for simultaneously searching a learning agent's spaces of representations and parameter values for the optimal choices, and thereby improving the accuracy of its prediction method.
Machine learning systems have also conventionally used a single learning agent for producing a prediction method. Until now, no effective method has been developed that utilizes the interaction of multiple learning agents to produce a more accurate prediction method.
An objective of the invention, therefore, is to provide a method and system that optimizes the selection of parameter values and input representations for a learning agent. Another objective of the invention is to provide a simple yet effective way of synergistically combining multiple learning agents in an integrated and extensible framework to produce a more accurate prediction method. With multiple, diverse learning agents sharing their output, the system can generate and exploit synergies between the learning methods and achieve results superior to those of any individual method acting alone.
SUMMARY OF THE INVENTION
A method of producing a more accurate prediction method for a problem in accordance with the invention comprises the following steps. Training data is provided that is related to a problem for which a prediction method is sought, the training data initially represented as a set of primitive features and their values. At least two learning agents are also provided, their input representations using the primitive features of the training data. The learning agents are then trained on the data set, each agent producing, in response to the data, a prediction method based on its input representation. Feature combinations are extracted from the prediction methods produced by the learning agents. The input representation of each learning agent is then modified to include feature combinations extracted from another learning agent. The learning agents are trained again on the augmented training data, causing a learning agent to produce another prediction method based on its modified input representation.
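The sketch below illustrates one plausible shape of this train/extract/augment/retrain cycle in Python; the agent methods train(), extract_feature_combinations(), and add_input_features() are hypothetical stand-ins for illustration, not the patented implementation.

```python
def feature_exchange_round(agents, training_data):
    """One train/extract/augment/retrain cycle (illustrative sketch only).

    Assumes each agent exposes hypothetical train(), extract_feature_combinations(),
    and add_input_features() methods; these names are not from the patent.
    """
    # 1. Train each agent on the data using its current input representation.
    prediction_methods = [agent.train(training_data) for agent in agents]

    # 2. Extract feature combinations from each resulting prediction method
    #    (e.g. the conjunction of tests along a decision-tree path, or the inputs
    #    feeding a strongly weighted hidden node of a neural network).
    extracted = [agent.extract_feature_combinations(pm)
                 for agent, pm in zip(agents, prediction_methods)]

    # 3. Augment each agent's input representation with combinations found by the other agents.
    for i, agent in enumerate(agents):
        for j, combos in enumerate(extracted):
            if i != j:
                agent.add_input_features(combos)

    # 4. Retrain on the augmented representations to produce new prediction methods.
    return [agent.train(training_data) for agent in agents]
```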
In another method of the invention, the parameter values of the learning agents are changed to improve the accuracy of the prediction method. The method includes determining a fitness measure for each learning agent based on the quality of the prediction method the agent produces. Parameter values of a learning agent are then selected based on the agent's fitness measure. Variation is introduced into the selected parameter values, and another learning agent is defined using the varied parameter values. The learning agents are again trained on the data set to cause a learning agent to produce a prediction method based on the varied parameter values.
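A comparable sketch of the fitness-driven parameter step is given below; fitness_of() and the treatment of numeric parameter values are assumptions made for illustration only.

```python
import copy
import random

def evolve_parameters(agents, training_data, fitness_of, variation=0.1):
    """One generation of fitness-based selection and variation of parameter values (sketch only).

    Assumes fitness_of() scores a prediction method (e.g. held-out accuracy); selection is
    fitness-proportional, and only numeric parameters are perturbed in this illustration.
    """
    # 1. Fitness: quality of the prediction method each agent currently produces.
    scores = [fitness_of(agent.train(training_data)) for agent in agents]

    # 2. Selection: sample existing agents with probability proportional to fitness.
    parents = random.choices(agents, weights=scores, k=len(agents))

    # 3. Variation: copy each selected agent and jitter its numeric parameter values.
    children = []
    for parent in parents:
        child = copy.deepcopy(parent)
        for name, value in child.parameters.items():
            jitter = 1.0 + random.uniform(-variation, variation)
            if isinstance(value, int):
                child.parameters[name] = max(1, round(value * jitter))
            elif isinstance(value, float):
                child.parameters[name] = value * jitter
        children.append(child)

    # 4. The new agents are trained again on the data set in the next round.
    return children
```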
The two methods may be used separately or in combination. Results indicate a synergistic interaction of the learning agents when the two methods are combined, yielding a still more accurate prediction method.
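Under the same assumptions as the two sketches above, the combined use of both methods could be composed per generation roughly as follows; this is an illustrative composition, not the patented procedure.

```python
def run(agents, training_data, fitness_of, generations=10):
    """Interleave feature exchange and parameter evolution (illustrative composition only)."""
    for _ in range(generations):
        feature_exchange_round(agents, training_data)                   # share extracted feature combinations
        agents = evolve_parameters(agents, training_data, fitness_of)   # select and vary parameter values
    return agents
```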
The foregoing and other objects, features, and advantages of the invention will become more apparent from the following detailed description of a preferred embodiment which proceeds with reference to the accompanying drawings.
REFERENCES:
patent: 5199077 (1993-03-01), Wilcox et al.
David R. H. Miller, Tim Leek and Richard M. Schwartz, “A hidden Markov model information retrieval system,” Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Jan. 1999, pp. 214-221.
Jian-Yun Nie, Martin Brisebois and Xiaobo Ren, “On Chinese text retrieval,” Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Jan. 1996, pp. 225-233.
Fei Song and W. Bruce Croft, “A general language model for information retrieval,” Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Jan. 1999, pp. 316-321.
Zhaohui Zhang, Yuanhui Zhou, Yuchang Lu and Bo Zhang, “Extracting rules from a GA-pruned neural network,” 1996 IEEE International Conference on Systems, Man and Cybernetics, vol. 3, Oct. 14-17, 1996, pp. 1682-1685.
Brasil, L.M., De Azevedo, F.M., Barreto, J.M. and Noirhomme-Fraiture, M., “A neuro-fuzzy-GA system architecture for helping the knowledge acquisition process,” Proceedings of the IEEE International Joint Symposia on Intelligence and Systems, May 21-23, 1998.
Fukumi, M. and Akamatsu, N., “Rule extraction from neural networks trained using evolutionary algorithms with deterministic mutation,” Neural Networks Proceedings, 1998 IEEE World Congress on Computational Intelligence.
Klarquist & Sparkman, LLP
Starks, Jr. Wilbert L.
The United States of America as represented by the Secretary of