Convergent actor critic-based fuzzy reinforcement learning...

Data processing: artificial intelligence – Plural processing systems

Reexamination Certificate

Details

C706S001000, C706S002000, C706S003000, C706S004000, C706S005000, C706S006000, C706S007000, C706S008000, C706S009000

active

06917925

ABSTRACT:
A system is controlled by an actor-critic based fuzzy reinforcement learning algorithm that provides instructions to a processor of the system for applying actor-critic based fuzzy reinforcement learning. The system includes a database of fuzzy-logic rules for mapping input data to output commands for modifying a system state, and a reinforcement learning algorithm for updating the fuzzy-logic rules database based on effects on the system state of the output commands mapped from the input data. The reinforcement learning algorithm is configured to converge at least one parameter of the system state to at least approximately an optimum value following multiple mapping and updating iterations. The reinforcement learning algorithm may be based on an update equation including a derivative with respect to at least one parameter of a logarithm of a probability function for taking a selected action when a selected state is encountered.
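The update described in the abstract can be illustrated with a small toy example. The sketch below is not the patented method; it is a minimal illustration under stated assumptions: three fuzzy rules with triangular membership functions, a Gaussian action policy whose mean is the firing-strength-weighted sum of the rule consequents, a one-step reward, and a critic that serves only as a baseline. All names (`CENTERS`, `SIGMA`, `firing_strengths`, `train`, and so on) and the toy target action 0.5 * x are hypothetical. The actor update multiplies the critic's error by the derivative, with respect to each rule consequent, of the logarithm of the probability of the selected action.

```python
import random

# Assumed toy rule base: three rules with triangular membership functions.
CENTERS = [-1.0, 0.0, 1.0]   # membership centres (hypothetical)
WIDTH = 1.0                  # half-width of each triangle (hypothetical)
SIGMA = 0.5                  # fixed exploration noise of the Gaussian policy


def firing_strengths(x):
    """Normalised firing strength of each rule for input x, clamped to [-1, 1]."""
    x = max(-1.0, min(1.0, x))
    raw = [max(0.0, 1.0 - abs(x - c) / WIDTH) for c in CENTERS]
    total = sum(raw)
    return [r / total for r in raw]


def actor_mean(theta, x):
    """Output command: firing-strength-weighted sum of rule consequents."""
    return sum(w * t for w, t in zip(firing_strengths(x), theta))


def grad_log_prob(theta, x, action):
    """Derivative of log N(action; mean(x), SIGMA^2) w.r.t. each consequent."""
    scale = (action - actor_mean(theta, x)) / SIGMA ** 2
    return [scale * w for w in firing_strengths(x)]


def train(steps=20000, alpha=0.002, beta=0.05, seed=0):
    """Run the toy actor-critic loop and return the learned rule consequents."""
    rng = random.Random(seed)
    theta = [0.0] * len(CENTERS)   # actor parameters (rule consequents)
    value = [0.0] * len(CENTERS)   # critic parameters (per-rule value)
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)                 # observed state
        w = firing_strengths(x)
        action = rng.gauss(actor_mean(theta, x), SIGMA)
        reward = -(action - 0.5 * x) ** 2          # assumed toy objective
        # Critic: the error between reward and predicted value is the signal.
        error = reward - sum(wi * vi for wi, vi in zip(w, value))
        grad = grad_log_prob(theta, x, action)
        value = [v + beta * error * wi for v, wi in zip(value, w)]
        # Actor: step along error times the derivative of log-probability.
        theta = [t + alpha * error * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(train())
```

In this toy setting the best action is 0.5 * x, so the consequents should drift toward roughly -0.5, 0 and 0.5 for the three rules. The exact values depend on the random seed and step sizes.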

REFERENCES:
patent: 5250886 (1993-10-01), Yasuhara et al.
patent: 5257343 (1993-10-01), Kyuma et al.
patent: 5608843 (1997-03-01), Baird, III
patent: 6529887 (2003-03-01), Doya et al.
patent: 6601049 (2003-07-01), Cooper
patent: 6633858 (2003-10-01), Yamakawa et al.
patent: 2003/0179717 (2003-09-01), Hobbs et al.
R. S. Sutton et al., “Toward a Modern Theory of Adaptive Networks: Expectation and Prediction”, Psychological Review, vol. 88, pp. 135-170, 1981, no month.
A. G. Barto et al., “Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems”, IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, pp. 834-846, Sep./Oct. 1983.
T. Takagi et al., “Fuzzy Identification of Systems and Its Application to Modeling and Control”, IEEE Transactions on Systems, Man, and Cybernetics, vol. 15, No. 1, pp. 116-132, Jan./Feb. 1985.
R. J. Williams et al., “Toward a Theory of Reinforcement-Learning Connectionist Systems”, Technical Report NU-CC-88-3, Northeastern University, College of Computer Science, Jul. 1988.
M. Sugeno, “Structure Identification Of Fuzzy Model”, Fuzzy Sets and Systems, vol. 28, pp. 15-33, 1988, no month.
C. J. H. Watkins, “Learning from Delayed Rewards”, Ph.D. Thesis, Cambridge University, May 1989.
H. R. Berenji, “A Reinforcement Learning-Based Architecture For Fuzzy Logic Control”, International Journal of Approximate Reasoning, vol. 6, No. 2, pp. 267-292, Feb. 1992.
H. R. Berenji et al., “Learning and Tuning Fuzzy Logic Controllers Through Reinforcements”, IEEE Transactions on Neural Networks, vol. 3, No. 5, pp. 724-740, Sep. 1992.
B. Kosko, “Fuzzy Systems as Universal Approximators”, IEEE Intl. Conf. On Fuzzy Systems (FUZZ-IEEE '92), pp. 1153-1162, 1992, no month.
L.-X. Wang, “Fuzzy systems are universal approximators”, IEEE International Conference on Fuzzy Systems (FUZZ-IEEE '92), pp. 1163-1169, 1992, no month.
D. A. White et al., Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, Van Nostrand Reinhold, pp. 93-139, 1992.
H. Berenji et al., “Clustering in Product Space for Fuzzy Inference”, Second IEEE International Conference Fuzzy Systems, vol. II, pp. 1402-1407, Mar./Apr. 1993.
H. Berenji, “Fuzzy Q-Learning: A New Approach for Fuzzy Dynamic Programming”, Third IEEE International Conference on Fuzzy Systems, vol. 1, pp. 486-491, Jun. 1994.
“CDG: CDMA Technology: About CDMA Technology: Introduction to CDMA”, [Internet] http://www.cdg.org/tech/a_ross/Intro.asp, 2 pages, printed Oct. 8, 2002.
“CDG: CDMA Technology: About CDMA Technology: Multiple Access Wireless Communications”, [Internet] http://www.cdg.org/tech/a_ross/MultipleAccess.asp, 3 pages, printed Oct. 8, 2002.
“CDG: CDMA Technology: About CDMA Technology: The CDMA Revolution”, [Internet] http://www.cdg.org/tech/a_ross/CDMARevolution.asp, 7 pages, printed Oct. 8, 2002.
“CDG: CDMA Technology: About CDMA Technology: Common Air Interface”, [Internet] http://www.cdg.org/tech/a_ross/CAI.asp, 1 page, printed Oct. 8, 2002.
“CDG: CDMA Technology: About CDMA Technology: Forward CDMA Channel”, [Internet] http://www.cdg.org/tech/a_ross/Forward.asp, 5 pages, printed Oct. 8, 2002.
“CDG: CDMA Technology: About CDMA Technology: Reverse CDMA Channel”, [Internet] http://www.cdg.org/tech/a_ross/Reverse.asp, 4 pages, printed Oct. 8, 2002.
T. Jaakkola et al., “Reinforcement Learning Algorithms for Partially Observable Markov Decision Problems”, Advances in Neural Information Processing Systems, vol. 7, pp. 345-352, 1995, no month.
H. R. Berenji, “Fuzzy Q-Learning for Generalization of Reinforcement Learning”, Fifth IEEE International Conference on Fuzzy Systems, vol. 3, pp. 2208-2214, Sep. 1996.
“Fuzzy Logic, Neural Networks, and Genetic Algorithms” Conference Advertisement, Intelligent Inference Systems Corp. Conference Date Oct. 1996.
D. P. Bertsekas et al., Neuro-Dynamic Programming, Athena Scientific, 1996, no month.
H. R. Berenji et al., “Refining the Shuttle Training Aircraft Controller”, Sixth IEEE International Conference On Fuzzy Systems, vol. II, pp. 677-681, Jul. 1997.
R. S. Sutton et al., Reinforcement Learning: An Introduction, MIT Press, 1998, no month.
H. R. Berenji et al., “Cooperation and Coordination Between Fuzzy Reinforcement Learning Agents in Continuous State Partially Observable Markov Decision Process”, Proceedings of the 8th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE '99), pp. 621-627, Aug. 1999.
V. Konda et al., “Actor-Critic Algorithms”, Advances in Neural Information Processing Systems, vol. 12, pp. 1008-1014, Nov. 1999.
L. C. Baird, “Gradient Descent for General Reinforcement Learning”, Advances in Neural Information Processing Systems 11, 7 pages, 1999, no month.
S. V. Hanly et al., “Power Control and Capacity of Spread Spectrum Wireless Networks”, Automatica, vol. 36, No. 12, pp. 1987-2012, 1999, no month.
N. Bambos et al., “Power Controlled Multiple Access (PCMA) in Wireless Communication Networks”, Proceedings of IEEE Conference on Computer Communications (IEEE Infocom 2000), New York, Mar. 2000.
J. Baxter, “Reinforcement Learning in POMDP's via Direct Gradient Ascent”, Proceedings of the 17th International Conference on Machine Learning, pp. 41-48, Jun. 29-Jul. 2, 2000.
D. Vengerov, “Advantages of Cooperation Between Reinforcement Learning Agents in Difficult Stochastic Problems”, Proceedings of the 9th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2000), pp. 871-876, 2000, no month.
R. S. Sutton, “Policy gradient methods for reinforcement learning with function approximation”, Advances in Neural Information Processing Systems 12, pp. 1057-1063, 2000, no month.

Profile ID: LFUS-PAI-O-3426004
