Automated method for building a model

Data processing: artificial intelligence – Neural network – Learning task

Reexamination Certificate


Details

Classification: C706S906000, C706S907000
Type: Reexamination Certificate
Status: active
Patent number: 06243696

ABSTRACT:

BACKGROUND OF THE INVENTION
A common problem that is encountered in training neural networks for prediction, forecasting, pattern recognition, sensor validation and/or processing problems is that some of the training/testing patterns might be missing, corrupted, and/or incomplete. Prior systems merely discarded data with the result that some areas of the input space may not have been covered during training of the neural network. For example, if the network is utilized to learn the behavior of a chemical plant as a function of the historical sensor and control settings, these sensor readings are typically sampled electronically, entered by hand from gauge readings and/or entered by hand from laboratory results. It is a common occurrence that some or all of these readings may be missing at a given time. It is also common that the various values may be sampled on different time intervals. Additionally, any one value may be “bad” in the sense that after the value is entered, it may be determined by some method that a data item was, in fact, incorrect. Hence, if the data were plotted in a table, the result would be a partially filled-in table with intermittent missing data or “holes”, these being reminiscent of the holes in Swiss cheese. These “holes” correspond to “bad” or “missing” data. The “Swiss-cheese” data table described above occurs quite often in real-world problems.
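The "Swiss-cheese" table described above can be illustrated with a minimal sketch (sensor names and values here are hypothetical, not from the patent), where None marks a bad or missing reading:

```python
# Sketch of a "Swiss-cheese" historical data table: rows are sample
# times, columns are sensor variables, None marks bad/missing readings.
# Sensor names and values are hypothetical.
table = {
    "time":  [0, 1, 2, 3],
    "flow":  [4.2, None, 4.5, 4.4],       # gauge reading missed at t=1
    "temp":  [310.0, 311.2, None, None],  # lab results not yet entered
    "press": [1.01, 1.02, 1.00, None],
}

def holes(table):
    """Count missing cells ("holes") per variable."""
    return {k: sum(v is None for v in vals)
            for k, vals in table.items() if k != "time"}

print(holes(table))  # prints {'flow': 1, 'temp': 2, 'press': 1}
```

Every row here has at least one hole somewhere in the table, so a complete-pattern method would struggle to keep any of it.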
Conventional neural network training and testing methods require complete patterns, such that they are forced to discard patterns with missing or bad data. The deletion of the bad data in this manner is an inefficient method for training a neural network. For example, suppose that a neural network has ten inputs and ten outputs, and also suppose that one of the inputs or outputs happens to be missing at the desired time for fifty percent or more of the training patterns. Conventional methods would discard these patterns, leading to no training for those patterns during the training mode and no reliable predicted output during the run mode. This is inefficient, considering that for this case more than ninety percent of the information is still there in the patterns that conventional methods would discard. The predicted outputs corresponding to those areas of the input space will be ambiguous and erroneous. In some situations, there may be as much as a 50% reduction in the overall data after screening out bad or missing data. Additionally, experimental results have shown that neural network testing performance generally increases with more training data, such that throwing away bad or incomplete data decreases the overall performance of the neural network.
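The inefficiency in the ten-input/ten-output example above can be made concrete with a short calculation (the pattern counts are illustrative, chosen to match the fifty-percent figure in the text):

```python
# Quantify what complete-pattern screening discards: 10 variables per
# pattern, 100 patterns, half of them missing exactly one value.
n_patterns, n_vars = 100, 10
incomplete = 50                        # patterns with one missing value

patterns_kept = n_patterns - incomplete
values_present = n_patterns * n_vars - incomplete   # one hole per bad row

print(patterns_kept / n_patterns)              # 0.5 -> half the rows discarded
print(values_present / (n_patterns * n_vars))  # 0.95 -> 95% of values exist
```

Half the patterns are thrown away even though 95% of the individual measurements are still present, which is the waste the text describes.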
In addition to the above, when data is retrieved on different time scales, it is necessary to place all of the data on a common time scale. However, this is difficult in that for a given time scale, another and longer time scale results in missing data at that position. For example, if one set of data were taken on an hourly basis and another set of data were taken on a quarter hour basis, there would be three areas of missing data if the input time scale is fifteen minutes. This data must be filled in to assure that all data is presented at synchronized times to the system model. Worse yet, the data sample periods may be non-periodic, producing totally asynchronous data.
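A minimal sketch of the hourly/quarter-hour merge described above, using a fifteen-minute common time base (the carry-forward fill policy and the values are my assumptions; the patent does not fix a particular fill method at this point):

```python
# Merge an hourly series and a quarter-hour series onto a common
# 15-minute time base. Missing hourly slots are filled by carrying the
# last observed hourly value forward (one possible fill policy).
hourly  = {0: 10.0, 60: 12.0}                  # minutes -> value
quarter = {t: float(t) for t in range(0, 75, 15)}

grid = sorted(quarter)                         # 0, 15, 30, 45, 60
merged = []
last = None
for t in grid:
    if t in hourly:
        last = hourly[t]
    merged.append((t, last, quarter[t]))       # (time, hourly, quarter-hour)

print(merged)
# [(0, 10.0, 0.0), (15, 10.0, 15.0), (30, 10.0, 30.0),
#  (45, 10.0, 45.0), (60, 12.0, 60.0)]
```

Note the three filled slots per hour (15, 30, 45 minutes), matching the "three areas of missing data" in the text.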
In addition, this data may be taken on different machines in different locations, with different operating systems and quite different data formats. It is essential to be able to read all of these different data formats and write the data values, together with their time stamps, out to one or more "flat files" which are column oriented, each column corresponding to a data variable and/or the date/time stamp of that variable. It is a formidable task to retrieve this data, keep track of the date/time information, and read it into an internal data table (spreadsheet) so that the data can be time merged.
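The read-and-merge task can be sketched as follows; the two column-oriented flat files below (comma- and semicolon-delimited, with hypothetical variable names and values) stand in for files produced by different machines, and the result is an internal data table keyed by time stamp:

```python
import csv
import io

# Two "flat files" from different machines/formats, each column-oriented
# with a date/time stamp per row (formats and values are hypothetical).
file_a = io.StringIO("time,flow\n2020-01-01T00:00,4.2\n2020-01-01T01:00,4.5\n")
file_b = io.StringIO("time;temp\n2020-01-01T00:00;310.0\n")

def read_flat(f, delim):
    """Read one flat file into {timestamp: {variable: value}}."""
    out = {}
    for row in csv.DictReader(f, delimiter=delim):
        t = row.pop("time")
        out[t] = {k: float(v) for k, v in row.items()}
    return out

# Time-merge the files into one internal data table keyed by time stamp.
table = {}
for part in (read_flat(file_a, ","), read_flat(file_b, ";")):
    for t, vals in part.items():
        table.setdefault(t, {}).update(vals)

print(table["2020-01-01T00:00"])  # prints {'flow': 4.2, 'temp': 310.0}
```

Rows present in only one file leave a "hole" in the merged table (here, `temp` at 01:00), which is exactly the Swiss-cheese situation described earlier.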
Another aspect of data integrity concerns inherent delays in a system. For example, in a chemical processing system, a flow meter output can provide data at time t0 at a given value. However, a given change in flow resulting in a different reading on the flow meter may not affect the output until after a predetermined delay τ. In order to predict what the output would be, this flow meter output must be input to the network at a delay equal to τ. This must also be accounted for in the training of the network. In generating data that accounts for time delays, it has been postulated that it would be possible to generate a table of data that comprises both original data and delayed data. This necessitates a significant amount of storage in order to store all of the delayed data and all of the original data, wherein only the delayed data is utilized. Further, in order to change the value of the delay, an entirely new set of input data must be generated from the original set.
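The storage objection above can be sidestepped by producing the delayed view through index arithmetic instead of a second, shifted copy of the table. This sketch (values hypothetical) pairs a flow-meter reading delayed by τ samples with the output at time t; changing τ requires no regeneration of the data:

```python
# Present a flow-meter input to the network delayed by tau samples,
# without storing a second, shifted copy of the data: the "delayed"
# series is just an index offset into the original array.
flow   = [4.2, 4.3, 4.5, 4.4, 4.6]    # original input samples (hypothetical)
output = [9.0, 9.1, 9.4, 9.3, 9.5]    # plant output samples (hypothetical)
tau = 2                               # delay in sample periods

def delayed_pairs(x, y, tau):
    """Pair x[t - tau] with y[t]; changing tau needs no new data table."""
    return [(x[t - tau], y[t]) for t in range(tau, len(y))]

print(delayed_pairs(flow, output, tau))
# [(4.2, 9.4), (4.3, 9.3), (4.5, 9.5)]
```

Only the original arrays are stored; the delayed dataset exists only as the pairing produced on demand.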
SUMMARY OF THE INVENTION
The present invention disclosed and claimed herein comprises a method for creating a representation of a plant and incorporating it into a run-time prediction system for generating predicted output values representing the operating parameters of the plant during operation thereof. A historical database is provided representing the operation of the plant and comprised of data associated with plant inputs and plant outputs. Data is extracted from the historical database, and a dataset of variables corresponding to the inputs and outputs from the historical database is created. An off-line predictive model of the plant is then created utilizing the created dataset to predict a plant output, the off-line model defined by off-line model parameters. An on-line model is then created for generating predicted output values in real time during the operation of the plant, defined by on-line model parameters. The on-line model parameters are then replaced with the off-line model parameters after generation thereof.
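The off-line/on-line split described above can be sketched with a one-input linear model standing in for the predictive model (the model class and the closed-form fit are my simplifications, chosen to make the parameter transfer explicit):

```python
# Sketch of the off-line/on-line split: fit a model off-line on
# historical data, then replace the on-line model's parameters with
# the off-line parameters. A linear model stands in for the network.
class Model:
    def __init__(self):
        self.w, self.b = 0.0, 0.0          # model parameters
    def predict(self, x):
        return self.w * x + self.b

def fit_offline(xs, ys):
    """Least-squares fit on the extracted historical dataset."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    m = Model()
    m.w, m.b = w, my - w * mx
    return m

# Off-line: model built from the historical dataset (values hypothetical).
offline = fit_offline([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# On-line: run-time predictor, initially untrained; its parameters are
# then replaced with the off-line model parameters.
online = Model()
online.w, online.b = offline.w, offline.b

print(online.predict(4.0))             # prints 9.0
```

The point of the pattern is that the on-line model never trains in the run-time environment; it only receives parameters generated off-line.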
In another aspect of the present invention, a graphical interface is provided to a user to assist the user in performing the steps. Each step is facilitated by an interactive graphical interface with specific instructions and data-input inquiries for the associated step, to assist the user at that particular step.
In yet another aspect of the present invention, a method is provided for determining, via a predicted value, an output value having a known relationship to an input value. The method includes training a predictive model with a set of known outputs for a given set of inputs that exist in a finite dataset. This is followed by the step of inputting data to the predictive model that is within the set of given inputs. An output is then predicted from the predictive model corresponding to the given input, such that a predicted output value is obtained which has associated therewith the errors of the predictive model.
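The last point, that a prediction at an input within the trained set still carries the residual error of the model, can be illustrated with a small least-squares fit on slightly noisy data (all values hypothetical):

```python
# Predict within the finite set of trained inputs; the prediction still
# carries the residual error of the fitted model (values hypothetical).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.9, 4.2, 5.8]             # roughly y = 2x, with noise

# Closed-form least-squares line as the "predictive model".
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
    sum((x - mx) ** 2 for x in xs)
b = my - w * mx

x_in = 2.0                            # an input from the trained set
y_hat = w * x_in + b                  # predicted output
residual = abs(y_hat - ys[xs.index(x_in)])   # model error at that input

print(round(residual, 3))             # nonzero: the model's own error
```

Even though `x_in` was in the training set, the prediction differs from the recorded output by the model's residual, which is the error the text refers to.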


REFERENCES:
patent: 4910691 (1990-03-01), Skeirik
patent: 4965742 (1990-10-01), Skeirik
patent: 5006992 (1991-04-01), Skeirik
patent: 5111531 (1992-05-01), Grayson et al.
patent: 5121467 (1992-06-01), Skeirik
patent: 5150313 (1992-09-01), van den Engh et al.
patent: 5353207 (1994-10-01), Keeler et al.
patent: 5444820 (1995-08-01), Tzes et al.
patent: 5461699 (1995-10-01), Arababi et al.
patent: 5581459 (1996-12-01), Enbutsu et al.
patent: 5613041 (1997-03-01), Keeler et al.
patent: 5704011 (1997-12-01), Hensen et al.
patent: 5720003 (1998-02-01), Chiang et al.
patent: 6002839 (1999-12-01), Keeler et al.
patent: 0262647A3 (1986-09-01), None
patent: 0327268A2 (1988-02-01), None
patent: WO92/17951 (1991-04-01), None
“A Model for Temporal Correlation of Biological Neuronal Spike Trains,” David C. Tam and the late Donald H. Perkel, IJCNN International Joint Conference on Neural Networks, Department of Physiology and Biophysics, University of California, Irvine, CA 92717, pp. I-781-I-786.
“Neural Network Architecture for Adaptive System Modeling and Control,” Esther Levin, Raanan Gewirtzman and Gideon F. Inbar, Neural Networks, pp. 185-191, 1991.
“Towards Practical Control De

