Proceedings:
Vol. 22 (2012): Twenty-Second International Conference on Automated Planning and Scheduling
Track:
Full Technical Papers
Abstract:
Recent research leverages results from the continuous-armed bandit literature to create a reinforcement-learning algorithm for continuous state and action spaces. The algorithm was initially proposed in a theoretical setting; we provide the first examination of its empirical properties. Through experimentation, we demonstrate the effectiveness of this planning method when coupled with exploration and model learning and show that, in addition to its formal guarantees, the approach is very competitive with other continuous-action reinforcement learners.
DOI:
10.1609/icaps.v22i1.13507