Dynamic programming network

Data processing: artificial intelligence – Neural network – Learning task

Reexamination Certificate

Details

C706S025000, C706S019000, C706S012000

Reexamination Certificate

active

06513022

ABSTRACT:

RIGHTS OF THE GOVERNMENT
The invention described herein may be manufactured and used by or for the Government of the United States for all governmental purposes without the payment of any royalty.
BACKGROUND OF THE INVENTION
The field of the invention relates to dynamic programming and more specifically to dynamic programming within a network structure.
Dynamic programming is a process that discovers an optimal trajectory toward a goal by deriving values of states encountered during exploration from values of succeeding states. The various forms of dynamic programming such as Q-learning, TD-Lambda, value iteration, and advantage learning often require extensive exploration and large amounts of memory for maintaining values for the vast number of states typically encountered in useful applications.
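As a rough illustration of deriving a state's value from the values of succeeding states, the following minimal sketch runs value iteration on a hypothetical five-state chain. The chain layout, the reward of 1.0 for reaching the goal, and the discount factor `gamma` are assumptions made for this example and do not come from the patent.

```python
def value_iteration(n_states=5, gamma=0.9, sweeps=100):
    """Back up state values on a 1-D chain whose last state is the goal.

    Assumed toy problem: from each state the agent may step left or right,
    and entering the goal state yields a reward of 1.0.
    """
    goal = n_states - 1
    values = [0.0] * n_states
    for _ in range(sweeps):
        for s in range(n_states):
            if s == goal:
                continue  # terminal state keeps a value of 0
            candidates = []
            for step in (-1, 1):
                nxt = min(max(s + step, 0), goal)  # stay inside the chain
                reward = 1.0 if nxt == goal else 0.0
                # A state's value derives from the value of its successor.
                candidates.append(reward + gamma * values[nxt])
            values[s] = max(candidates)
    return values


if __name__ == "__main__":
    print(value_iteration())
```

Note that the table holds one entry per state, which is the memory cost the passage above refers to when the number of states grows large.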
Function approximators, such as neural networks, can be used in the art to mitigate the memory liability associated with most forms of dynamic programming and afford good performance with the experience of just a small sample of all possible states. The dynamic programming network offers an alternative or collateral strategy by intelligently directing sensors toward regions of interest within a state, processing and retaining only information that contributes to the achievement of an objective. An experienced dynamic programming network may therefore considerably reduce the amount of exploration necessary to arrive at an optimal or good-enough solution.
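To make the memory argument concrete, here is a minimal sketch in which a two-weight linear function approximator stands in for a per-state table. It uses semi-gradient TD(0) on a hypothetical ten-state chain with a fixed "always step toward the goal" policy. The feature encoding, step size `alpha`, and reward scheme are assumptions for illustration only; the patent does not prescribe them.

```python
def features(state, n_states):
    """Hypothetical encoding: a bias term plus the normalized position."""
    return [1.0, state / (n_states - 1)]


def predict(weights, phi):
    """Linear value estimate: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, phi))


def td0_linear(n_states=10, gamma=0.9, alpha=0.05, episodes=500):
    """Semi-gradient TD(0) with two weights instead of n_states table entries."""
    weights = [0.0, 0.0]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            nxt = s + 1  # assumed fixed policy: always move toward the goal
            reward = 1.0 if nxt == goal else 0.0
            # The goal is terminal, so it contributes no bootstrapped value.
            bootstrap = 0.0 if nxt == goal else gamma * predict(
                weights, features(nxt, n_states))
            phi = features(s, n_states)
            error = reward + bootstrap - predict(weights, phi)
            weights = [w + alpha * error * x for w, x in zip(weights, phi)]
            s = nxt
    return weights


if __name__ == "__main__":
    w = td0_linear()
    print([round(predict(w, features(s, 10)), 3) for s in range(9)])
```

The approximator stores two numbers regardless of how many states the chain has, at the cost of only approximating each state's value.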
Current methods for identifying military targets usually attempt to match a target with images stored within a database. Usually the database is quite large and contains real or synthetic images of the same target in as many orientations, configurations, and articulations as possible. Not all variations can be anticipated, and insignificant variations can hinder finding a match. In military applications, searching for a match takes considerably longer than the duration of a mission and, because of the size of the database, processing cannot be performed aboard a tactical aircraft. The dynamic programming network of this invention conserves memory and processes image data with profound speed when implemented on the fine-grained parallel computers for which it was designed.
Sensor management, sensor fusion, and target recognition are seldom integrated well and are at best essentially independent software modules that only exchange data. The few modules known to adapt do so almost independently of the requirements of these other functions. Further, experts generally handcraft these functions so they are tailored to a specific environment, rendering the functions rigid in their application. The dynamic programming network integrates these functions seamlessly.
The present invention may be accurately described as a dynamic programming network. It cannot be compared directly with known error-backpropagation neural networks because the error that back-propagates in such neural networks derives from a known desired response, whereas a dynamic programming network must discover an unknown desired response after a lengthy trial and error search of states. The present invention does allow for the possibility that a dynamic programming network, using a function approximator to maintain its elements' state values, learns to accept an error or desired response via its sensors.
The dynamic programming network conserves memory and processes image data with profound speed. The method of the invention is not rigid to a specific application but can be used in a wide variety of applications with minimal tailoring.
SUMMARY OF THE INVENTION
The dynamic programming network integrates sensor management, sensor fusion, and an application in a seamless structure in which these functions are mutually dependent and develop autonomously and concomitantly with experience. The dynamic programming network autonomously divides these functions into multiple subtasks that it can assign to the processors of a fine-grained parallel computer. As the number of processors available for these subtasks increases, the network may attain its objective more efficiently. This architecture confers the greatest advantage in feature-rich applications such as identification of targets in synthetic aperture radar, visual, and infrared images. The design can be extended, however, to such diverse and general applications as control problems and machine intelligence. For the pattern recognition application described here, the dynamic programming network detects, selects, and identifies features and patterns comprising those features via a series of observations rather than processing all data available in each image, thereby minimizing sensor usage and volume of data processed. The network remembers similar features contained in many images instead of many images containing similar features, thus conserving memory and facilitating data retrieval.
It is therefore an object of the invention to provide an efficient, memory conserving dynamic programming system and method.
It is another object of the invention to provide an infinitely scalable dynamic programming network.
It is another object of the invention to provide a dynamic programming network that integrates sensor management, sensor fusion and an application in a seamless structure in which these functions are mutually dependent and develop autonomously and concomitantly with experience.
These and other objects of the invention are described in the description, claims and accompanying drawings and are achieved by an efficient, memory conserving, application integrating dynamic programming method comprising the steps of:
establishing a prototype element of a network, said establishing comprising the steps of:
assigning a table or function approximator for maintaining state values;
identifying a method for determining element state based on state values maintained from said assigning step;
applying a process for dynamically programming said element's state values based on succeeding state values resulting from said element's state from said identifying step;
connecting a plurality of elements from said establishing step to form a network;
coupling signal transmitting sensors to elements from said connecting step;
coupling elements from said connecting step to effectors;
maintaining within each element a running average of values for the state of an element in a cycle after such value occurs;
cycling said network by determining the state of all elements and sensors therein, selecting as each element's state the highest running average value from said maintaining step;
sending an output signal to network effectors; and
presenting to said sensors a pattern based on a state that results from effector activity from said sending step.
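The steps above can be loosely sketched in code. The following is a toy interpretation under stated assumptions, not the patented implementation: the `Element` class, the `run_network` function, the exploration rate, the averaging `rate`, and the scoring rule (an element is rewarded when its state equals the one-bit sensor pattern) are all hypothetical names and choices introduced for illustration. Each element keeps a table of running-average values, selects the state with the highest running average, and updates that average in the cycle after the state occurs.

```python
import random


class Element:
    """Hypothetical network element with a table of running-average values."""

    def __init__(self, n_states=2, rate=0.1):
        self.values = {}      # (sensor inputs, state) -> running average
        self.n_states = n_states
        self.rate = rate
        self.last_key = None  # entry to update in the following cycle

    def choose(self, inputs, explore=0.0, rng=random):
        """Select the state with the highest running average (or explore)."""
        if rng.random() < explore:
            state = rng.randrange(self.n_states)
        else:
            state = max(range(self.n_states),
                        key=lambda k: self.values.get((inputs, k), 0.0))
        self.last_key = (inputs, state)
        return state

    def update(self, succeeding_value):
        """Move the running average toward the value that followed the state."""
        old = self.values.get(self.last_key, 0.0)
        self.values[self.last_key] = old + self.rate * (succeeding_value - old)


def run_network(cycles=500, seed=0):
    """Cycle a two-element network against an assumed one-bit sensor."""
    rng = random.Random(seed)
    elements = [Element(), Element()]
    for _ in range(cycles):
        pattern = (rng.randrange(2),)  # sensor reading presented to elements
        states = [e.choose(pattern, explore=0.2, rng=rng) for e in elements]
        # Effector stand-in: the environment scores each element's output,
        # and that score serves as the succeeding value (a simplification).
        for e, s in zip(elements, states):
            e.update(1.0 if s == pattern[0] else 0.0)
    return elements


if __name__ == "__main__":
    for element in run_network():
        print(element.choose((0,)), element.choose((1,)))
```

In this simplification the value that follows a state is an immediate score rather than the value of a succeeding network state, so it shows the running-average bookkeeping but not multi-step credit assignment.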


REFERENCES:
patent: 5195167 (1993-03-01), Bahl et al.
patent: 5416889 (1995-05-01), Takahashi et al.
patent: 5452400 (1995-09-01), Takahashi et al.
patent: 5608843 (1997-03-01), Baird, III
patent: 5644681 (1997-07-01), Takahashi et al.
patent: 5689622 (1997-11-01), Higashino et al.
patent: 5727081 (1998-03-01), Burges et al.
patent: 5857178 (1999-01-01), Kimura et al.
patent: 5946673 (1999-08-01), Francone et al.
patent: 6023693 (2000-02-01), Masuoka et al.
patent: 6353814 (2002-03-01), Weng
Mizutani, E., Sample Path-Based Policy-Only Learning By Actor Neural Networks, International Joint Conference on Neural Networks, Jul. 1999, vol. 2, pp. 1245-1250.*
Berenji, H., Fuzzy Q-Learning: A New Approach for Fuzzy Dynamic Programming, Proceedings of the 3rd IEEE Conference on Fuzzy Systems, Jun. 1994, vol. 1, pp. 489-491.*
Horiuchi et al., Q-PSP Learning: An Exploitation-Oriented Q-Learning Algorithm and Its Applications, Proceedings of the IEEE International Conference on Evolutionary Computation, May 1996, pp. 76-81.*
Bersini et al., Three Connectionist Implementations of Dynamic Programming for Optimal Control: A Preliminary Comparative Analysis, Int'l Workshop on Neural Networks for ID, Control, Robotics and Signal/Image Processing, Aug. 1996, pp. 428-437.*
Bertsekas, D.P., New Value Iteration and Q
