Data processing: artificial intelligence – Neural network
Reexamination Certificate
1997-03-17
2001-04-03
Hafiz, Tariq R. (Department: 2762)
C706S016000
active
06212508
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a process for the neural modeling of dynamic processes having very different time constants.
2. Description of the Related Art
Neural networks are employed in the most diverse of technical fields. They have proven to be particularly suitable wherever decisions are to be derived from complex technical relationships and from insufficient information. To form one or more output variables, for example, one or more input variables are fed to the neural network. For this purpose, such a network is first trained for the specific use, then generalized, and finally validated using a data set other than the training data. Neural networks prove to be particularly suitable for many uses, since they are universally trainable.
A problem which often occurs in conjunction with the use of neural networks is, however, that the number of inputs of the neural network is often too large and hence the network appears to be unnecessarily complex for the application. In particular, the excessively large neural network does not reach the required performance on the generalization data during training. It then often learns application examples by heart rather than learning the problem structure. In practice, it is therefore desirable to limit the number of possible input variables as far as possible to those that are necessary, that is to say to the number of input variables which have the greatest effects on the output variables that are to be determined. The problem may also arise in practice that a neural network is intended to be supplied with input variables which arise at different times, some of which are located hours or days apart. For such eventualities, for example, recurrent neural networks are used. These networks contain feedback paths between the neurons internally and it is thus possible for them to construct a type of memory about the input variables which have arisen. However, because of the simpler handling, in particular because they are more easily trainable, forwardly directed neural networks often appear desirable in the field of use.
In the case of industrial processes, in particular in the case of biochemical processes, different partial processes having very different time constants often interact. Chemical reactions often take place in a fraction of a second. During the degradation or synthesis of materials by microorganisms and the growth and death of bacteria or fungi, time constants of hours or days often occur. Time constants in the range of hours and days occur in particular in systems in which there are material circulations having feedback and intermediate storage. Separate treatment of the partial processes, which progress at different speeds, is often not possible. Thus, for example, there is a close coupling between the individual processes proceeding in the purification of sewage. In addition, measured values “between” the individual processes, which would be a precondition for separate neural modeling, can be obtained only at very high cost, if at all. This is true in particular of the biochemical processes which proceed in sewage treatment plants.
A suitable representation of the input data, in conjunction with the selection of relevant process information by means of the neural network, is the precondition for being able to model the simultaneous action of different time constants neurally.
In order to be able to model very rapidly progressing partial processes neurally, on the one hand it is necessary for data to be obtained at a very high sampling rate and to be applied to the neural network as input variables. On the other hand, for modeling the slowly progressing processes, the data over a range reaching appropriately far back into the past are to be applied as input values to a forwardly directed network. This method of proceeding has the disadvantage that the neural network has a large number of inputs even with just a small number of measured variables and hence has a large quantity of adaptable parameters. As a result of this high number of free parameters, the network has a complexity which is higher than is appropriate for the data, and tends toward “overfitting”, see the references Hergert, F., Finnoff, W., Zimmermann, H. G.: “A comparison of weight elimination methods for reducing complexity in neural networks”, in Proceedings of the Int. Joint Conf. on neural networks, Baltimore, 1992, and Hergert, F., Finnoff, W., Zimmermann, H. G.: “Evaluation of Pruning Techniques”, ESPRIT Projekt Bericht [Project Report] 5293 - Galatea, Doc. No.: S23.M12.-August 1992. Thus, for the data points used for training, the neural network indeed reaches a very good approximation to the data. For the “generalization data” not used for training, however, networks having too many adaptable parameters exhibit poor performance.
An alternative possibility for the neural modeling of processes having very different time constants is the use of recurrent neural networks (RNN). Because of the feedback which is realized in the network, RNNs are capable of storing information from previous data and thus of modeling processes having long time constants or having feedback. The disadvantage of RNNs is that simple learning processes such as, for example, back propagation, can no longer be used and, instead, specific learning processes such as, for example, Real Time Recurrent Learning (RTRL) or Back Propagation Through Time (BPTT) must be employed. Especially in the case of a high number of data points, RNNs are difficult to train and tend toward numerical instability, see the reference Sterzing, V., Schirmann, B.: “Recurrent Neural Networks for Temporal Learning of Time Series”, in Proceedings of the 1993 International Conference on Neural Networks, March 27-31, San Francisco, 843-850.
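The growth in inputs and adaptable parameters described above for a forwardly directed network fed with a long history of samples can be illustrated with a short calculation. The following is only a sketch under stated assumptions: the function names, the single hidden layer, and the example figures (3 measured variables, a 60 second sampling period, a 24 hour history, 10 hidden neurons) are hypothetical and do not come from the patent.

```python
def delay_line_inputs(n_variables, sample_period_s, history_s):
    """Inputs needed when every sample in the history window is applied directly.

    Hypothetical helper: one input per measured variable per sampling instant,
    counting the current sample plus all samples reaching back history_s seconds.
    """
    taps = history_s // sample_period_s + 1
    return n_variables * taps


def mlp_parameter_count(n_inputs, n_hidden, n_outputs):
    """Weights and biases of a fully connected network with one hidden layer.

    Assumption: a single hidden layer; the source does not fix a topology.
    """
    hidden_layer = (n_inputs + 1) * n_hidden    # +1 for the bias of each hidden neuron
    output_layer = (n_hidden + 1) * n_outputs   # +1 for the bias of each output neuron
    return hidden_layer + output_layer


# Illustrative figures only: 3 measured variables, sampled once per minute,
# with one day of history applied to the network.
n_inputs = delay_line_inputs(n_variables=3, sample_period_s=60, history_s=24 * 3600)
n_params = mlp_parameter_count(n_inputs, n_hidden=10, n_outputs=1)
print(n_inputs, n_params)
```

Even with only three measured variables, these assumed figures give several thousand inputs and tens of thousands of adaptable parameters, which is the situation in which the text expects overfitting.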
The international patent publication WO-A-94/17489 discloses a Back Propagation network which defines preprocessing parameters and time delays in the training mode. In the operating mode, the operating data are then processed together with the preprocessing parameters and, together with the defined time delays, are fed into the system as input data. Such a network is particularly suitable for applications in which the input data are based on different time scales.
The journal article IEEE EXPERT, Volume 8, No. 2, Apr. 1, 1993, pages 44 to 53, by J. A. Leonard and M. A. Kramer: “Diagnosing dynamic faults using modular neural nets” discloses general possibilities of the diagnosis of dynamic errors in modular networks. In that publication, for example, the time aspect of input data is taken into account in that each time a new input data set is added, the oldest data set is dispensed with and the system then carries out a new calculation. By this means, the number of input data sets to be taken into account by the system in each case is kept within limits. In the prior art cited, therefore, limited possibilities are indicated for taking into account the past of the system. Furthermore, their different techniques are extensively explained. No further relevant prior art is known.
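The windowing scheme attributed above to the Leonard and Kramer article, in which each new input data set displaces the oldest one, can be sketched as follows. This is a minimal illustration and not the implementation from that article; the class name `SlidingWindow` and its `push` method are hypothetical.

```python
from collections import deque


class SlidingWindow:
    """Keeps only the most recent `size` input data sets (hypothetical helper)."""

    def __init__(self, size):
        # A deque with maxlen drops its oldest element automatically
        # whenever a new one is appended to a full buffer.
        self._buffer = deque(maxlen=size)

    def push(self, data_set):
        """Add the newest data set and return the current window, oldest first."""
        self._buffer.append(data_set)
        return list(self._buffer)


window = SlidingWindow(size=3)
for measurement in [0.1, 0.2, 0.3, 0.4, 0.5]:
    current = window.push(measurement)
    # `current` is what would be applied to the network for a new calculation.
print(current)
```

The window size bounds the number of inputs, but it also bounds how far into the past the network can look, which is the limitation the text points out.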
An object on which the invention is based is to provide an arrangement and a process with which the number of input variables of a neural network arising over time can be reduced in a practical manner. It is intended, in particular, by means of the inventive process and the inventive arrangement to realize a memory, or remembrance, for forwardly directed neural networks. Furthermore, it is intended that the inventive process shall meet the special requirements which underlie chemical processes having different time constants.
This and other objects and advantages are achieved in a process for conditioning an input variable of a neural network,
a) in which a time series is formed from a set of values of the input variable by determining the input variable at discrete times,
b) in which, from the time series, at least a first interval is delimited in such a way that the length of the interv
Maschlanka Jörg
Sterzing Volkmar
Tresp Volker
Hafiz Tariq R.
Schiff & Hardin & Waite
Siemens Aktiengesellschaft
Starks, Jr. Wilbert L.