Data processing: artificial intelligence – Neural network – Learning task
Reexamination Certificate
1999-08-02
2002-07-16
Davis, George B. (Department: 2121)
Data processing: artificial intelligence
Neural network
Learning task
C706S025000, C706S026000
active
06421654
ABSTRACT:
TECHNICAL DOMAIN
This invention relates to a learning process that generates neural networks designed to sort data, whose architecture is built up as a function of the needs of the task to be carried out.
Applications of this invention lie in domains that make use of neural networks, particularly medical diagnosis, pattern recognition, and the sorting of objects or data such as spectra, signals, etc.
STATE OF THE ART
Neural networks, or neuron networks, are systems that carry out calculations on digital information and are inspired by the behavior of physiological neurons. A neural network must therefore learn to carry out the tasks that will be required of it subsequently. This is done using an examples base, or learning base, that contains a series of known examples used to teach the neural network to carry out the tasks that it will later have to reproduce on unknown information.
A neural network is composed of a set of formal neurons. Each neuron is a calculation unit composed of at least one input and one output corresponding to a state. At each instant, the state of each neuron is communicated to the other neurons in the set. Neurons are connected to each other by connections, each of which has a synaptic weight.
The total neuron activation level is equal to the sum of the states of its input neurons, each weighted by the synaptic weight of the corresponding connection. At each instant, this activation level is used by the neuron to update its output.
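As an illustration of this neuron model, the following sketch (in Python, with purely illustrative names) computes the activation level of a single binary neuron as the weighted sum of its input states and thresholds it to obtain the output state; the bias term and the ±1 state convention are assumptions made for the example.

```python
# Minimal sketch of a formal binary neuron: its activation level is the
# weighted sum of its input states, each multiplied by the synaptic weight
# of the corresponding connection; its output is the sign of that sum.
# The function name, bias term and +1/-1 convention are illustrative only.

def binary_neuron(inputs, weights, bias=0.0):
    """Return the +1/-1 state of a neuron given its input states."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation >= 0 else -1

# Example: a neuron with two inputs.
state = binary_neuron(inputs=[1, -1], weights=[0.5, -1.0], bias=0.2)
print(state)  # activation = 0.5 + 1.0 + 0.2 = 1.7, so the output state is +1
```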
In particular, the neural network may be a layered network; in this case, the network comprises one or several layers of neurons each connected to the previous layer, the last layer being the output layer. Each configuration of network inputs produces an internal representation that is used by subsequent layers, for example to sort the input state.
At the present time there are several learning algorithms used to sort data or to recognize shapes based on neural networks, as explained for example in the document entitled “Introduction to the theory of neural computation”, by HERTZ, KROGH and PALMER (1991), Addison-Wesley.
Conventionally, the algorithms are used to fix network parameters, namely values of connection weights, the number of neurons used in the network, etc., using a number of known examples of data to be sorted. These examples make up the learning base.
The most frequently used algorithm is the gradient backpropagation algorithm. This type of algorithm is described for example in the document entitled “A learning scheme for asymmetric threshold networks”, in Cognitiva, Y. LE CUN, CESTA-AFCET Ed., pages 559-604 (1985). This algorithm consists of minimizing a cost function associated with the quality of the network output. However, it requires neurons whose states are represented by real numbers, even for problems that are typically binary, such as the problem of sorting into two classes. Gradient backpropagation also requires that the number of neurons to be used be fixed beforehand, yet there is no theoretical criterion to guide the expert in the field in determining this necessary number of neurons.
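As a purely illustrative sketch of the kind of procedure described above (not taken from the patent), the following Python code trains a small one-hidden-layer network of real-valued sigmoid neurons by gradient backpropagation of a squared-error cost; the fixed hidden-layer size, learning rate and the XOR example are assumptions made for the example.

```python
# Hedged sketch of gradient backpropagation on a one-hidden-layer network
# of real-valued (sigmoid) neurons, minimizing a squared-error cost over
# the learning base.  All sizes and hyperparameters are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, y, n_hidden=4, lr=0.5, epochs=2000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    for _ in range(epochs):
        h = sigmoid(X @ W1)           # hidden states: the internal representation
        out = sigmoid(h @ W2)         # network output
        err = out - y                 # derivative of the squared-error cost w.r.t. out
        grad_out = err * out * (1 - out)
        grad_h = (grad_out @ W2.T) * h * (1 - h)  # error backpropagated to the hidden layer
        W2 -= lr * h.T @ grad_out / len(X)
        W1 -= lr * X.T @ grad_h / len(X)
    return W1, W2

# Example: the XOR sorting problem, which is not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_backprop(X, y)
```

Note that the number of hidden neurons (here fixed at 4) must be chosen before training begins; this is precisely the limitation pointed out above.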
Other algorithms, called “constructivist” or “adaptive” algorithms, adapt the number of neurons in the network as a function of the task to be carried out. Furthermore, some of these algorithms use only binary neurons, as described for example in the document entitled “Learning in feed forward layered neural networks: the tiling algorithm”, MEZARD and NADAL, J. Phys. A 22, pages 2191-2203 (1989), and in the document entitled “Learning by activating neurons: a new approach to learning in neural networks”, RUJAN and MARCHAND, Complex Systems 3 (1989), page 229. The main disadvantage of these algorithms is that the learning rule applied at each neuron in the network is not efficient, so that the resulting networks are too large for the task to be solved and do not generalize well.
The performance of an algorithm is measured by its generalization capacity, in other words its capacity to predict the class to which data not in the learning base belongs. In practice, it is measured by the “generalization error”, which is the percentage of data in a test base (containing known examples and independent of the learning base) sorted incorrectly by the network whose parameters were determined by the learning algorithm. One efficient learning algorithm for a neuron is described in the document entitled “Learning with temperature dependent algorithm” by GORDON and GREMPEL, Europhysics Letters, No. 29, pages 257 to 262, January 1995, and in the document entitled “Minimerror: perceptron learning rule that finds the optimal weights” by GORDON and BERCHIER, ESANN'93, Brussels, M. VERLEYSEN Ed., pages 105-110. These documents describe an algorithm called “Minimerror” that can learn sorting tasks by means of binary neurons. This algorithm has the advantage that its convergence is guaranteed and that it has good numerical performance, in other words an optimum generalization capacity.
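For illustration only, the following Python sketch implements a temperature-dependent perceptron learning rule in the spirit of the approach cited above; the precise cost function, the use of distinct temperatures for correctly and incorrectly classified examples, and the annealing schedule of the published Minimerror algorithm may differ, so this should be read as an assumption-laden sketch rather than the algorithm itself.

```python
# Hedged sketch of a temperature-dependent perceptron rule in the spirit of
# Minimerror (not the exact published algorithm).  Each example's cost is a
# smooth step of its stability gamma = y * (w . x) / ||w||, and the
# temperature T, annealed downward, controls how sharply the examples close
# to the separating hyperplane dominate the weight updates.
import numpy as np

def train_temperature_perceptron(X, y, lr=0.05, T_start=5.0, T_end=0.2,
                                 epochs=500, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    for epoch in range(epochs):
        T = T_start + (T_end - T_start) * epoch / (epochs - 1)  # anneal T down
        beta = 1.0 / T
        norm = np.linalg.norm(w)
        gamma = y * (X @ w) / norm                   # stabilities of the examples
        # cost per example: 0.5 * (1 - tanh(beta * gamma / 2))
        dcost_dgamma = -beta / (4.0 * np.cosh(beta * gamma / 2.0) ** 2)
        dgamma_dw = (y[:, None] * X) / norm - np.outer(gamma, w) / norm ** 2
        w -= lr * np.sum(dcost_dgamma[:, None] * dgamma_dw, axis=0)
    return w
```

Here the targets y are assumed to take the values ±1, matching the binary neuron convention used above.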
Furthermore, this Minimerror algorithm has been associated with another constructivist-type learning rule called “Monoplane”. This Monoplane algorithm is described in the article entitled “An evolutive architecture coupled with optimal perceptron learning for classification” by TORRES-MORENO, PERETTO and GORDON, in ESANN'95, European Symposium on Artificial Neural Networks, Brussels, April 1995, pages 365 to 370. This Monoplane algorithm combines the Minimerror algorithm with a method of generating internal representations.
This Monoplane algorithm is of the incremental type, in other words it can be used to build a neural network with a hidden layer, by adding neurons as necessary.
Its performance is thus better than that of the other algorithms cited above: for an identical examples base used with a series of different algorithms, the results achieved with this Monoplane algorithm are better, in other words they have a lower generalization error.
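The incremental construction can be pictured with the following Python sketch, which is a generic constructivist loop and not the published Monoplane procedure: hidden binary neurons are added one at a time (each trained, for instance, with a rule such as the one sketched above), and the process stops once an output neuron trained on the resulting internal representation sorts the learning base without error, or a size limit is reached. The way the targets of each new hidden neuron are chosen here is an assumption made for illustration.

```python
# Hedged sketch of a constructivist loop in the spirit of incremental
# algorithms such as Monoplane (not the published procedure).
import numpy as np

def sign(z):
    return np.where(z >= 0, 1, -1)

def build_incremental_network(X, y, train_unit, max_hidden=10):
    """train_unit(X, targets) -> weight vector of one binary neuron,
    e.g. the temperature-dependent rule sketched earlier."""
    hidden_weights = []
    targets = y.copy()                # the first hidden unit learns the classes themselves
    for _ in range(max_hidden):
        hidden_weights.append(train_unit(X, targets))
        H = sign(X @ np.array(hidden_weights).T)    # internal representation of the examples
        w_out = train_unit(H, y)                    # output neuron trained on the hidden states
        errors = sign(H @ w_out) != y
        if not errors.any():                        # learning base sorted without error
            break
        targets = np.where(errors, -1, 1)           # next unit learns to flag the errors
    return hidden_weights, w_out
```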
However, these Minimerror and Monoplane algorithms use neurons with a sigmoidal activation function that can only carry out linear separations. A network produced from this type of neuron can therefore only set up plane boundaries (composed of hyperplanes) between the domains of different classes. Therefore, when the boundaries between classes are curved, these networks can only approximate them piecewise linearly, which introduces a certain amount of inaccuracy and requires a large number of neurons.
However, other types of algorithms are capable of paving the space with hyperspheres, each of which is represented by a neuron. Such algorithms are described for example in COOPER, REILLY & ELBAUM (1988), Neural networks systems, an introduction for managers, decision-makers and strategists, NESTOR Inc., Providence, R.I. 02906-USA, and they usually end up with a very large number of neurons even when they include pruning operations. These algorithms therefore use too many resources, in other words too many neurons and too many weights, to carry out the task.
DESCRIPTION OF THE INVENTION
The purpose of the invention is to overcome the disadvantages of the techniques described above. It does this by proposing a learning process that generates small neural networks, built according to the needs of the task to be carried out and with excellent generalization. This learning process is intended for sorting objects or data into two classes separated by separation surfaces that may be quadratic, or both linear and quadratic.
More precisely, the invention relates to a process for learning, from an examples base composed of known input data and targets corresponding to the class of each of these input data, to sort objects into two distinct classes separated by at least one separating surface of quadratic type, or of quadratic and linear type. This process consists of generating a network of binary-type neurons, each comprising parameters describing the separating surface that it determines, this neural network comprising network inputs and a layer of hidden neurons connected to these inputs and to a network output.
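To make the notion of a quadratic separating surface concrete, here is a small Python illustration (the names and the centre/radius parameterisation are assumptions for the example, not the patented construction) of a binary neuron whose decision boundary is a hypersphere rather than a hyperplane.

```python
# Hedged illustration of a binary neuron with a quadratic (hyperspherical)
# separating surface, in contrast with the plane boundary of a linear neuron.
# The centre/radius parameterisation is illustrative, not the patented rule.
import numpy as np

def quadratic_neuron(x, centre, radius):
    """Return +1 if x lies inside the hypersphere, -1 otherwise."""
    return 1 if np.sum((np.asarray(x) - centre) ** 2) <= radius ** 2 else -1

# Example: a circular class boundary of radius 1 around the origin in 2-D.
centre = np.zeros(2)
print(quadratic_neuron([0.5, 0.5], centre, 1.0))   # inside the sphere  -> +1
print(quadratic_neuron([2.0, 0.0], centre, 1.0))   # outside the sphere -> -1
```

A single neuron of this kind can already carve out a curved domain that a linear neuron would need several hyperplanes to approximate.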
Burns Doane, Swecker, Mathis LLP
Commissariat a l'Energie Atomique
Davis George B.