Data processing: artificial intelligence – Neural network – Learning method
Reexamination Certificate
1999-10-28
2003-05-06
Follansbee, John A. (Department: 2121)
Reexamination Certificate
active
06560586
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a learning process for a neural network.
2. Background Books and Articles
The following books and articles are useful items for understanding the technical background of this invention, and each item is incorporated in its entirety by reference for its useful background information. Each item has an item identifier which is used in the discussions below.
i. B. L. Bowerman and R. T. O'Connell, Time Series Forecasting, New York: PWS, 1987.
ii. G. E. P. Box and G. M. Jenkins, Time Series Analysis, Forecasting, and Control, San Francisco, Calif.: Holden-Day, 1976.
iii. A. Cichocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, New York: Wiley, 1993.
iv. A. S. Weigend and N. A. Gershenfeld, Eds., Time Series Prediction: Forecasting the Future and Understanding the Past, Reading, Mass.: Addison-Wesley, 1994.
v. A. Lapedes and R. Farber, “Nonlinear signal processing using neural networks: Prediction and system modeling,” Los Alamos Nat. Lab. Tech. Rep. LA-UR 87-2662, 1987.
vi. K. Hornik, “Approximation Capability of Multilayer Feedforward Networks,” Neural Networks, vol. 4, 1991.
vii. M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function,” Neural Networks, vol. 6, pp. 861-867, 1993.
viii. S. G. Mallat, “A Theory for Multiresolution Signal Decomposition: The Wavelet Representation,” IEEE Trans. Pattern Anal. Machine Intell., vol. 11, pp. 674-693, July 1989.
ix. E. B. Baum and D. Haussler, “What Size Net Gives Valid Generalization,” Neural Comput., vol. 1, pp. 151-160, 1989.
x. S. Geman, E. Bienenstock, and R. Doursat, “Neural Networks and the Bias/Variance Dilemma,” Neural Comput., vol. 4, pp. 1-58, 1992.
xi. K. J. Lang, A. H. Waibel, and G. E. Hinton, “A time-delay neural network architecture for isolated word recognition,” Neural Networks, vol. 3, pp. 23-43, 1990.
xii. Y. LeCun, “Generalization and network design strategies,” Univ. Toronto, Toronto, Ont., Canada, Tech. Rep. CRG-TR-89-4, 1989.
xiii. E. A. Wan, “Time Series Prediction by Using a Connectionist Network With Internal Delay Lines,” in Time Series Prediction: Forecasting the Future and Understanding the Past, Reading, Mass.: Addison-Wesley, 1994, pp. 195-218.
xiv. D. C. Plaut, S. J. Nowlan, and G. E. Hinton, “Experiments on Learning by Back-Propagation,” Carnegie Mellon Univ., Pittsburgh, Pa., Tech. Rep. CMU-CS-86-126, 1986.
xv. A. Krogh and J. A. Hertz, “A Simple Weight Decay Can Improve Generalization,” Adv. Neural Inform. Process. Syst., vol. 4, pp. 950-957.
xvi. A. S. Weigend, D. E. Rumelhart, and B. A. Huberman, “Back-propagation, weight-elimination and time series prediction,” in Proc. Connectionist Models Summer Sch., 1990, pp. 105-116.
xvii. A. S. Weigend, B. A. Huberman, and D. E. Rumelhart, “Predicting the Future: A Connectionist Approach,” Int. J. Neural Syst., vol. 1, no. 3, pp. 193-209, 1990.
xviii. M. Cottrell, B. Girard, Y. Girard, M. Mangeas, and C. Muller, “Neural Modeling for Time Series: A Statistical Stepwise Method for Weight Elimination,” IEEE Trans. Neural Networks, vol. 6, pp. 1355-1364, November 1995.
xix. R. Reed, “Pruning Algorithms—A Survey,” IEEE Trans. Neural Networks, vol. 4, pp. 740-747, 1993.
xx. M. B. Priestley, Non-Linear and Non-Stationary Time Series Analysis, New York: Academic, 1988.
xxi. Y. R. Park, T. J. Murray, and C. Chen, “Predicting Sun Spots Using a Layered Perceptron Neural Network,” IEEE Trans. Neural Networks, vol. 7, pp. 501-505, March 1996.
xxii. W. E. Leland and D. V. Wilson, “High Time-Resolution Measurement and Analysis of LAN Traffic: Implications for LAN Interconnection,” in Proc. IEEE INFOCOM, 1991, pp. 1360-1366.
xxiii. W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the Self-Similar Nature of Ethernet Traffic,” in Proc. ACM SIGCOMM, 1993, pp. 183-192.
xxiv. W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the Self-Similar Nature of Ethernet Traffic (Extended Version),” IEEE/ACM Trans. Networking, vol. 2, pp. 1-15, February 1994.
Related Work
Traditional time-series forecasting techniques can be represented as autoregressive integrated moving average (ARIMA) models (see items i and ii, above). The traditional models can provide good results when the dynamic system under investigation is linear or nearly linear. However, for cases in which the system dynamics are highly nonlinear, the performance of traditional models might be very poor (see items iii and iv, above). Neural networks have demonstrated great potential for time-series prediction. Lapedes and Farber (see item v) first proposed using multilayer feedforward neural networks for nonlinear signal prediction in 1987. Since then, research examining the approximation capabilities of multilayer feedforward neural networks (see items vi and vii) has justified their use for nonlinear time-series forecasting and has resulted in the rapid development of neural network models for signal prediction.
A major challenge in neural network learning is to ensure that trained networks possess good generalization ability, i.e., that they generalize well to cases that were not included in the training set. Some research results have suggested that, in order to get good generalization, the training set should form a substantial subset of the sample space (see items ix and x). However, obtaining a sufficiently large training set is often impossible in practical real-world problems where only a relatively small number of samples is available for training.
Recent approaches to improving generalization attempt to reduce the number of free weight parameters in the network. One approach is weight sharing as employed in certain time-delay neural networks (TDNN's) (see items xi and xii) and finite impulse response (FIR) networks (see item xiii). However, this approach usually requires that the nature of the problem be well understood so that designers know how weights should be shared. Another approach is to start network training using an excessive number of weights and then remove the excess weights during training. This approach leads to a family of pruning algorithms including weight decay (see item xv), weight-elimination (see items xvi and xvii), and the statistical stepwise method (SSM, see item xviii). For a survey of pruning techniques, see item xix. While pruning techniques might offer some benefit, this approach remains inadequate for difficult learning problems. As mentioned in item xix, for example, it is difficult to handle multi-step prediction with the statistical stepwise method.
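The weight-decay idea cited above (item xv) can be sketched in a few lines. The following is a minimal illustration, not taken from the patent: the model, data, and all settings (penalty coefficient, learning rate, dimensions) are assumptions chosen for demonstration. A squared-norm penalty on the weights is added to the error, so each gradient step shrinks every weight toward zero; weights the data does not support decay away and can then be pruned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression problem: only 3 of 10 inputs are relevant.
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [1.5, -2.0, 0.7]
y = X @ true_w + 0.01 * rng.normal(size=200)

def train(lam, steps=2000, lr=0.01):
    """Gradient descent on mean squared error plus lam * ||w||^2.

    The extra 2 * lam * w term in the gradient is the weight decay:
    it pulls every weight toward zero at each step.
    """
    w = rng.normal(size=10)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y) + 2 * lam * w
        w -= lr * grad
    return w

w_decay = train(lam=0.1)   # with weight decay
w_plain = train(lam=0.0)   # ordinary least-squares fit

# The decayed solution has a strictly smaller norm; the 7 irrelevant
# weights are driven close to zero and are candidates for pruning.
```

The decayed weights trade a small bias on the relevant coefficients for much smaller irrelevant coefficients, which is exactly the bias/variance trade-off discussed in item x.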
There is therefore a need for a neural network learning process that gives a trained network possessing good generalization ability so as to provide good results even when the dynamic system under investigation is highly nonlinear.
SUMMARY OF THE INVENTION
It is an object of this invention to provide a neural network learning process that yields a trained network with good generalization ability even for highly nonlinear dynamic systems. In one embodiment, the objective is realized in a method of predicting a value in a series of values. According to this method, several approximations of a signal are obtained, each at a different respective resolution, using the wavelet transform. A neural network is then trained on the approximations in succession, beginning with the lowest-resolution approximation and continuing up through the higher-resolution approximations. The trained neural network is used to predict values, and exhibits good generalization even for highly nonlinear dynamic systems. In a preferred embodiment of the invention, the trained neural network is used in predicting network traffic patterns.
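The coarse-to-fine training described above can be sketched as follows. This is a hedged illustration only, not the patent's implementation: it assumes Haar-style pairwise averaging as the multiresolution approximation, a one-hidden-layer network trained by plain backpropagation, and arbitrary sizes and learning settings. The key point it demonstrates is the training schedule: the same network is trained in sequence on successively finer approximations of the signal, reusing its weights at each stage.

```python
import numpy as np

def haar_approx(signal, level):
    """Coarse approximation of `signal`: `level` rounds of pairwise
    averaging (Haar scaling), then upsampling back to full length."""
    a = signal.copy()
    for _ in range(level):
        a = 0.5 * (a[0::2] + a[1::2])
    for _ in range(level):
        a = np.repeat(a, 2)
    return a

def windows(sig, w=8):
    """Sliding windows of length w and the next-value targets."""
    X = np.stack([sig[i:i + w] for i in range(len(sig) - w)])
    return X, sig[w:]

rng = np.random.default_rng(1)
t = np.arange(512)
x = np.sin(2 * np.pi * t / 64) + 0.3 * np.sin(2 * np.pi * t / 9)

# One-hidden-layer network; weights persist across training stages.
W1 = 0.1 * rng.normal(size=(8, 16)); b1 = np.zeros(16)
W2 = 0.1 * rng.normal(size=16);      b2 = 0.0

def fit(sig, epochs=200, lr=0.01):
    """Train the network by backpropagation to predict the next value."""
    global W1, b1, W2, b2
    X, y = windows(sig)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        pred = H @ W2 + b2
        err = pred - y
        gW2 = H.T @ err / len(y); gb2 = err.mean()
        dH = np.outer(err, W2) * (1 - H ** 2)
        gW1 = X.T @ dH / len(y); gb1 = dH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

# Multiresolution schedule: coarsest approximation first, full signal last.
for level in (3, 2, 1, 0):
    fit(haar_approx(x, level))

Xf, yf = windows(x)
mse = np.mean((np.tanh(Xf @ W1 + b1) @ W2 + b2 - yf) ** 2)
```

In a full implementation the approximations would come from a proper wavelet decomposition (item viii) rather than simple pairwise averaging, but the successive coarse-to-fine reuse of the network weights is the same.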
REFERENCES:
patent: 5621861 (1997-04-01), Hayashi et al.
patent: 6285992 (2001-09-01), Kwasny et al.
Wing-Chung Chan et al; Transformation of Back-Propagation Networks in Multiresolution Learning; 1994; IEEE; INSPEC 4903733; 290-294.
Liang Yao
Page Edward W.
Alcatel
Follansbee John A.
Hirl Joseph P.
Sughrue & Mion, PLLC