Reinforcement Learning
Learning through Reward & Punishment during Problem Solving

Introductory Readings

Topics in Reinforcement Learning and Glossary of Terminology in Reinforcement Learning. From the Reinforcement Learning Repository at the University of Massachusetts, Amherst.

Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto. MIT Press, Cambridge, MA, 1998. View the HTML version of this publication, but be advised that it "has a number of presentation problems, and its text is slightly different from the real book, but it may be useful for some purposes." The introduction (1.1) presents a very clear picture of what RL is ... and is not.

Reinforcement Learning: A Survey. L. P. Kaelbling, M. L. Littman, and A. W. Moore (1996). J. Artificial Intelligence Research (JAIR), Volume 4, pages 237-285. "This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word 'reinforcement.' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning."

RL3 Lab - The Rutgers Laboratory for Real-Life Reinforcement Learning: "Our ultimate goal is to develop autonomous intelligent agents through the study of learning algorithms that strive to maximize reward. Our research seeks to expand the scope and applicability of existing reinforcement-learning work by focusing attention on real-life situations."

How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning. By Satinder Singh, Peter Norvig, and David Cohn, Adaptive Systems Group, Harlequin Inc. (1996). "Many people see agents and agent-based programming ushering in a new era in computing, particularly in the environment of the internet. ... That means we will need a way to describe our preferences to software agents, and a methodology for building agents that best satisfy our preferences. The pleasant surprise is that for many problems, once we know the preferences, we're almost done! Given the preferences, a list of possible actions, and enough time to practice taking actions, we can apply the formalism of Reinforcement Learning (or RL) to build an agent that acts according to the preferences in a near-optimal way. This article shows how."

Dynamic Channel Allocation in Cellular Telephones: a Demo. Produced by Satinder Singh.
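
The readings above describe an agent that is given rewards and a list of possible actions, then practices until it acts near-optimally while trading off exploration against exploitation. As a rough illustration only, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. The function name `run_bandit`, the Gaussian reward model, and all parameter values are assumptions made for this example; they are not taken from any of the cited works.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Gaussian bandit (hypothetical toy setup)."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running estimate of each action's mean reward
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an action uniformly at random.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Assumed reward model: noisy reward around the action's true mean.
        reward = rng.gauss(true_means[action], 1.0)

        # Incremental sample-average update of the chosen action's estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.0, 0.5, 2.0])
    print("estimated mean rewards:", estimates)
    print("times each action was chosen:", counts)
```

With a small `epsilon`, the agent spends most of its steps on the action it currently believes is best, while the occasional random choice keeps the other estimates from going stale.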

General Readings

Learning to Play Black Jack with Artificial Neural Networks. By Andrés Perez-Uribe, Logic Systems Laboratory, Swiss Federal Institute of Technology-Lausanne. "Blackjack or twenty-one is a card game where the player attempts to beat the dealer, by obtaining a sum of card values that is equal to or less than 21 so that his total is higher than the dealer's. The probabilistic nature of the game makes it an interesting testbed problem for learning algorithms, though the problem of learning a good playing strategy is not obvious. ... We have explored the use of blackjack as a test bed for learning strategies in neural networks, and specifically with reinforcement learning techniques."

Technical Publications on Reinforcement Learning from the Reinforcement Learning Repository. Some of the topics covered are: Applications to Robotics, Distributed and Multi-Agent RL, Industrial Applications, Neuro-biological RL, Partially Observable Problems, Planning, and TD-learning.

Temporal Difference Learning and TD-Gammon. By Gerald Tesauro. Originally published in Communications of the ACM, March 1995 / Vol. 38, No. 3. "This article presents a game-learning program called TD-Gammon. TD-Gammon is a neural network that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome. Although TD-Gammon has greatly surpassed all previous computer programs in its ability to play backgammon, that was not why it was developed. Rather, its purpose was to explore some exciting new ideas and approaches to traditional problems in the field of reinforcement learning."

Tutorial Slides & Notes

Tutorial: Reinforcement Learning Algorithms for MDPs. Csaba Szepesvári and Rich Sutton. July 11, 2010. Abstract: "Reinforcement learning is a popular and highly-developed approach to artificial intelligence with a wide range of applications. By integrating ideas from dynamic programming, machine learning, and psychology, reinforcement learning methods have enabled much better solutions to large-scale sequential decision problems than had previously been possible. This tutorial will cover Markov decision processes and approximate value functions as the formulation of the reinforcement learning problem, and temporal-difference learning, function approximation, and Monte Carlo methods as the principal solution methods. The focus will be on the algorithms and their properties. Applications of reinforcement learning in robotics, game-playing, the web, and other areas will be highlighted. The main goal of the tutorial is to orient the AI researcher to the fundamentals and research topics in reinforcement learning, preparing them to evaluate possible applications and to access the literature efficiently."

Reinforcement Learning Tutorial Slides by Andrew Moore. "Reinforcement Learning concerns the fascinating question of whether you can train a controller to perform optimally in a world where it may be necessary to suck up some short term punishment in order to achieve long term reward. We will discuss certainty-equivalent RL, the Temporal Difference (TD) learning, and finally Q-learning. The curse of dimensionality will be constantly learning over our shoulder, salivating and cackling."

Related Resources

Reinforcement Learning and Artificial Intelligence (RLAI). Currently hosted at the University of Alberta. "The ambition of this web site is to promote and support collaboration in reinforcement learning and artificial intelligence research throughout the world. RLAI research is research that is directed toward the long-standing goals of AI (understanding the mind, reproducing human abilities) and is based on reinforcement learning ideas (learning from and while interacting with the world)."

Reinforcement Learning Repository. "The purpose of this web site is to provide a centralized resource for research on Reinforcement Learning (RL), which is currently an actively researched topic in artificial intelligence. This site contains resources on both RL research and applications to areas such as robotics and industrial problems. ... This web site was supported by the National Science Foundation and was developed at the Autonomous Agents Laboratory at Michigan State University. It is currently maintained by the Autonomous Learning Laboratory at the University of Massachusetts, Amherst."

The Annual RL Competition: Official website for the Annual Reinforcement Learning Competition. This event provides a forum for reinforcement learning researchers to rigorously compare the performance of their methods on a suite of challenging domains. This page links to descriptions and results of past competitions.
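
Several of the tutorials listed above name temporal-difference learning and Q-learning, and one of them frames RL as accepting short-term punishment to reach long-term reward. The following is a minimal sketch of tabular Q-learning under stated assumptions: the five-state corridor, the step penalty of -1, the goal reward of +10, and every name and parameter value here are hypothetical choices for illustration, not material from the tutorials themselves.

```python
import random

N_STATES = 5        # assumed toy corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)  # action index 0 moves left, action index 1 moves right


def step(state, action_index):
    """Deterministic toy environment: -1 per move, +10 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
               max_steps=1000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]

    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):  # safety cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)                         # explore
            else:
                action = max(range(2), key=lambda a: q[state][a])  # exploit

            next_state, reward, done = step(state, action)

            # Q-learning target: reward plus discounted best next-state value.
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next
                                         - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action index for each non-terminal state."""
    return [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = q_learning()
    print("greedy policy (0 = left, 1 = right):", greedy_policy(q_table))
```

In this toy setting every move is penalized, yet the learned greedy policy should still head right in every non-terminal state, because the discounted goal reward outweighs the step costs along the way.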

Other References Offline

Robots Are Learning, But No "Terminators" Are About To Appear. By David Isaac. Investor's Business Daily (December 31, 2004; subscription req'd.). "On the other hand, it turns out aspects of machine-learning can be compared to the way the mind works. Take reinforcement learning, one approach that is part of machine learning and human learning. This approach involves giving the robot a reward, essentially pushing a green button for good and a red button for bad. The robot is allowed to roam around. When it does what you want, you press the green button, [Tom] Mitchell says. The robot's computer program is designed so that pushing green increases the likelihood that the robot will repeat the action, say, docking with its recharger. It seems this is similar to how the mind functions. Mitchell says the chemical dopamine, which produces a sensation of pleasure, acts like the green button, as a reward signal."

A Review of Reinforcement Learning. By Sebastian Thrun and Michael L. Littman. AI Magazine 21(1): 103-105 (Spring 2000). Review of Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, The MIT Press, Cambridge, Massachusetts, 1998, 322 pp., ISBN 0-262-19398-1.
