Data processing: artificial intelligence – Neural network – Learning method
Reexamination Certificate
1999-05-05
2001-06-19
Chaki, Kakali (Department: 2122)
C706S021000, C706S023000, C706S027000
Reexamination Certificate
active
06249781
ABSTRACT:
FIELD OF THE INVENTION
The current invention relates generally to a method and apparatus for machine learning of a pattern sequence and more particularly to a method and apparatus for machine learning of a pattern sequence utilizing an incrementally adjustable gain parameter.
BACKGROUND OF THE INVENTION
The task of a machine learning a pattern sequence that is a linear function of multiple inputs is a central problem in many technical fields, including adaptive control and estimation, signal processing, artificial intelligence, pattern recognition, and neural networks. The machine must perform responsive tracking of the pattern sequence in real time while achieving fast convergence in a computationally efficient manner. The learning process is often made more difficult by the fact that very little prior knowledge of the system generating the sequence is available. Moreover, while the inputs to the machine for learning the pattern may be identified, the relevance and weight of each input in affecting the output pattern sequence is usually not known.
Methods of determining the relevance of a particular input along with a specific weight are known. The weights are derived from a modifiable gain parameter, which is adjusted based on the auto-correlation of the increments in the identified input. When the current increment is positively correlated with a certain average of the preceding input increments, the gain parameter is increased; conversely, if the input increments are negatively correlated, the gain parameter is decreased. The gain parameters are adjusted to enhance the efficiency and responsiveness of the learning process.
Prior techniques for adapting the gain parameter of an adaptive learning process have been disclosed by Kesten in “Accelerated Stochastic Approximation”, Annals of Mathematical Statistics, vol. 29, 1958, pp. 41-59. The Kesten method reduces the gain parameters, moving them along a fixed schedule that converges to zero. The method cannot find a gain level appropriate to the dynamics of a non-stationary task and is limited to a single gain parameter for the entire system.
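For illustration, the fixed-schedule behavior described above can be sketched as follows. This is a hypothetical, simplified rendering of a Kesten-style rule; the function name and the use of error sign changes as the oscillation signal are illustrative assumptions, not details from the cited paper:

```python
# Sketch of a Kesten-style gain schedule: a single step-size that moves to
# the next, smaller level only when successive errors change sign
# (suggesting oscillation around the target), and otherwise holds steady.

def kesten_gains(errors, a=1.0):
    """Return the gain used at each step for a sequence of errors."""
    gains = [a]
    sign_changes = 0
    for t in range(1, len(errors)):
        # A sign change between consecutive errors advances the schedule.
        if errors[t] * errors[t - 1] < 0:
            sign_changes += 1
        gains.append(a / (1 + sign_changes))
    return gains
```

Because the schedule can only stay level or shrink toward zero, such a gain can never re-grow to track a non-stationary task, which is the limitation noted above.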
A method entitled Delta-Bar-Delta (DBD) for accelerating convergence of neural networks is disclosed by Jacobs in “Increased Rates of Convergence Through Learning Rate Adaptation”, Neural Networks, vol. 1, 1988, pp. 295-307; by Chan et al. in “An Adaptive Training Algorithm for Back Propagation Networks”, Cambridge University Engineering Department Technical Report CUED/F-INFENG/TR.2, 1987; by Tollenaere in “SuperSAB: Fast Adaptive Back Propagation with Good Scaling Properties”, Neural Networks, vol. 3, 1990, pp. 561-573; by Devos et al. in “Self Adaptive Back Propagation”, Proceedings NeuroNimes, 1988, EZ, Nanterre, France; and by Lee et al. in “Practical Characteristics of Neural Network and Conventional Pattern Classifiers on Artificial and Speech Problems”, Advances in Neural Information Processing Systems, vol. 2, 1990, pp. 168-177. These DBD methods do not operate incrementally and are not dynamic: they modify the gain parameters only after a complete pass through the training set and thus cannot be applied to an on-line learning situation.
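The batch character of these methods can be seen in a sketch of the core Delta-Bar-Delta rule: after each full pass, a per-weight learning rate grows additively when the current gradient agrees in sign with a trace of past gradients and shrinks multiplicatively when they disagree. This is a simplified rendering of Jacobs' rule; the function name and parameter values are illustrative assumptions:

```python
# One epoch of Delta-Bar-Delta learning-rate adaptation (batch style):
# rates and traces are per-weight lists, grads are the gradients
# accumulated over a complete pass through the training set.

def dbd_update(rates, traces, grads, kappa=0.01, phi=0.1, theta=0.7):
    """Adapt per-weight rates; returns (new_rates, new_traces)."""
    new_rates, new_traces = [], []
    for r, tr, g in zip(rates, traces, grads):
        if tr * g > 0:        # agreement with the trace: additive increase
            r += kappa
        elif tr * g < 0:      # disagreement: multiplicative decrease
            r *= 1 - phi
        # exponential trace of past gradients (the "delta-bar")
        new_traces.append((1 - theta) * g + theta * tr)
        new_rates.append(r)
    return new_rates, new_traces
```

Because `grads` must be accumulated over the whole training set before the rates change, the rule cannot react sample-by-sample, which is why it does not transfer to on-line learning.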
Classical estimation methods, including the Kalman filter, least-squares methods, least-mean-squares (LMS), and normalized LMS, are described by Goodwin et al. in Adaptive Filtering Prediction and Control, Prentice Hall, 1984. These methods can be divided into classes with differing disadvantages. The Kalman filter offers optimal performance in terms of tracking error, but requires more detailed knowledge of the task domain than is usually available; in particular, it requires complete knowledge of the statistics of the unknown system's time variation. The least-squares methods require less such knowledge, but do not perform as well. In addition, both of these methods require a great deal of memory and computation: if the primary learning process has N parameters, then the complexity of these methods is of order N². That is, their memory and computational requirements increase with the square of the number of parameters being estimated. In many applications this number is very large, making these methods undesirable. The LMS and normalized LMS methods are much less complex, requiring memory and computation of only order N. However, these methods have slow convergence.
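The order-N character of LMS and normalized LMS is visible in a minimal sketch of their standard textbook update rules (these are the well-known generic forms, not code from this patent):

```python
# Plain LMS and normalized LMS single-step updates for a linear predictor
# y = w . x. Each step touches every weight exactly once, so memory and
# computation are O(N) in the number of parameters N.

def lms_step(w, x, target, alpha=0.1):
    """Plain LMS: w_i += alpha * error * x_i. Returns (new_w, error)."""
    error = target - sum(wi * xi for wi, xi in zip(w, x))
    return [wi + alpha * error * xi for wi, xi in zip(w, x)], error

def nlms_step(w, x, target, alpha=0.5, eps=1e-8):
    """Normalized LMS: step scaled by the input's squared norm."""
    error = target - sum(wi * xi for wi, xi in zip(w, x))
    norm = sum(xi * xi for xi in x) + eps
    return [wi + alpha * error * xi / norm for wi, xi in zip(w, x)], error
```

The fixed step-size `alpha` is the crux of the slow-convergence problem described above: a single global rate cannot be large for relevant inputs and small for irrelevant ones at the same time.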
Thus it is desirable to have a method of machine learning that achieves fast convergence and responsive tracking of a pattern sequence without excessive computation, system knowledge, or intervention in a real-time system.
OBJECTS OF THE INVENTION
Accordingly, it is a primary object of this invention to obviate the above noted and other disadvantages of the prior art.
It is a further object of the invention to provide a novel machine apparatus for detecting and learning pattern sequences.
It is a yet further object of the invention to provide a novel method for detecting and learning pattern sequences.
SUMMARY OF THE INVENTION
The above and other objects and advantages are achieved in one aspect of this invention with a method and apparatus for machine learning of a pattern sequence using an incrementally adaptive gain parameter to adjust the learning rate of the machine. The machine receives a plurality of inputs that may correspond to sensor information or the like and predicts the pattern sequence from past experience and the input values. Each input has associated with it an individual gain parameter and learning rate. The gain parameters are increased or decreased in real time in correlation with the accuracy of the learning process.
In one aspect of the invention, the pattern sequence is predicted utilizing a weighted linear combination of the inputs. The particular weights are derived from the individual learning rates of the inputs and the associated gain parameters.
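A learner of this general kind, with a per-input gain parameter whose exponential gives the learning rate and a trace correlating recent errors with recent weight changes, might be sketched as follows. This is an illustrative reconstruction in the style of incremental Delta-Bar-Delta, under assumed update rules; it is not presented as the patent's claimed method, and all names and constants are hypothetical:

```python
import math

# One on-line step of a per-input adaptive-gain linear learner.
# w: weights, beta: per-input gain parameters (log learning rates),
# h: per-input traces of recent weight changes, x: current inputs.

def idbd_step(w, beta, h, x, target, theta=0.05):
    """Returns updated (w, beta, h) and the prediction y."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # weighted linear combination
    delta = target - y                          # prediction error
    for i, xi in enumerate(x):
        # Raise the gain when the error-weighted input correlates with
        # the trace of recent changes; lower it when they anti-correlate.
        beta[i] += theta * delta * xi * h[i]
        alpha = math.exp(beta[i])               # per-input learning rate
        w[i] += alpha * delta * xi              # error-correction step
        # decaying trace of recent correlations for input i
        h[i] = h[i] * max(0.0, 1 - alpha * xi * xi) + alpha * delta * xi
    return w, beta, h, y
```

Note that each step costs O(N) memory and computation, matching the on-line, incremental operation the summary describes, in contrast to the O(N²) classical estimators discussed in the background.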
The disclosed method and apparatus are advantageously utilized in signal processing, adaptive control systems, and pattern recognition.
REFERENCES:
patent: 4933872 (1990-06-01), Vandenburg et al.
patent: 5092343 (1992-03-01), Spitzer et al.
patent: 5181171 (1993-01-01), McCormack et al.
patent: 5214746 (1993-05-01), Fogel et al.
patent: 5220640 (1993-06-01), Frank
patent: 5255348 (1993-10-01), Nenov
patent: 5446829 (1995-08-01), Wang et al.
patent: 5479571 (1995-12-01), Parlos et al.
patent: 5659667 (1997-08-01), Buescher et al.
patent: 5812992 (1998-09-01), De Vries
patent: 6169981 (2001-01-01), Werbos
Humphrey, W.S.; Singpurwalla, N.D., “Predicting (individual) software productivity”, IEEE Transactions on Software Engineering, vol. 17, Feb. 1991, pp. 196-207.
Bishop, W.B.; Djuric, P.M.; Johnston, D.E., “Bayesian Model Selection of Exponential Time Series Through Adaptive Importance Sampling”, IEEE Seventh SP Workshop on Statistical Signal and Array Processing, Jun. 26-29, 1994, pp. 51-54.
Badri, M.A., “Neural networks of combination of forecasts for data with long memory pattern”, IEEE International Conference on Neural Networks, 1996, vol. 1, Jun. 3-6, 1996, pp. 359-364.
Bittencourt, M.A.; Lin, F.C., “Time series for currency exchange rate of the Brazilian Real”, Proceedings of the IEEE/IAFE/INFORMS 2000 Conference on Computational Intelligence for Financial Engineering (CIFEr), Mar. 26-28, 2000, pp. 193-196.
Singh, S.; Stuart, E., “A pattern matching tool for time-series forecasting”, Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998, vol. 1, Aug. 16-20, 1998, pp. 103-105.
Komprej, I.; Zunko, P., “Short term load forecasting”, Proceedings of the 6th Mediterranean Electrotechnical Conference, 1991, May 22-24, 1991, pp. 1470-1473, vol. 2.
Lee, H.B., “Eigenvalues and eigenvectors of covariance matrices for signals closely spaced in frequency”, IEEE Transactions on Signal Processing, vol. 40, no. 10, Oct. 1992, pp. 2518-2535.
Sorensen, O., “Neural Networks Performing System Identification for Control Applications”, Aalborg University, pp. 172-176.
Sterzing, V. et al., “Recurr
Chaki Kakali
Leonard Charles Suchyta
Starks Wilbert
Verizon Laboratories Inc.