Proceedings:
Vol. 21 (2011): Twenty-First International Conference on Automated Planning and Scheduling
Track:
Short Papers
Abstract:
In this paper, we present a new algorithm that integrates recent advances in solving continuous bandit problems with sample-based rollout methods for planning in Markov Decision Processes (MDPs). Our algorithm, Hierarchical Optimistic Optimization applied to Trees (HOOT), addresses planning in continuous-action MDPs. Empirical results show that the performance of our algorithm meets or exceeds that of a similar discrete-action planner by eliminating the need for manual discretization of the action space.
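As an illustration of the continuous-bandit component named in the abstract, the following is a minimal sketch of a Hierarchical Optimistic Optimization (HOO) style bandit over a one-dimensional action interval. It is not the authors' implementation. The class names (Node, HOO), the parameters nu and rho, the default interval [0, 1], and the toy reward function are all assumptions made for this example. In a HOOT-style planner, one such bandit would presumably choose the action at each step of a rollout, with the sampled return used as the reward.

```python
import math
import random


class Node:
    """One cell [lo, hi] of the action interval at a given tree depth."""

    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.count = 0        # number of samples drawn inside this cell
        self.mean = 0.0       # running mean of rewards observed in this cell
        self.children = None  # becomes (left, right) once the cell is split


class HOO:
    """Simplified HOO-style bandit (hypothetical names and defaults)."""

    def __init__(self, lo=0.0, hi=1.0, nu=1.0, rho=0.5, seed=0):
        self.root = Node(lo, hi, 0)
        self.nu, self.rho = nu, rho  # assumed smoothness parameters
        self.t = 0                   # number of completed rounds
        self.rng = random.Random(seed)

    def _b_value(self, node):
        # Unsampled cells are maximally optimistic.
        if node.count == 0:
            return math.inf
        # Upper bound: mean + exploration bonus + cell-diameter term.
        u = (node.mean
             + math.sqrt(2.0 * math.log(self.t) / node.count)
             + self.nu * self.rho ** node.depth)
        if node.children is None:
            return u
        # Tighten the bound using the best child bound.
        return min(u, max(self._b_value(c) for c in node.children))

    def select(self):
        """Descend along the largest B-values, split the leaf, sample in it."""
        node, path = self.root, [self.root]
        while node.children is not None:
            node = max(node.children, key=self._b_value)
            path.append(node)
        mid = (node.lo + node.hi) / 2.0
        node.children = (Node(node.lo, mid, node.depth + 1),
                         Node(mid, node.hi, node.depth + 1))
        return self.rng.uniform(node.lo, node.hi), path

    def update(self, path, reward):
        """Record the reward in every cell on the selected path."""
        self.t += 1
        for node in path:
            node.count += 1
            node.mean += (reward - node.mean) / node.count

    def best_action(self):
        """Recommend the midpoint of the most-visited chain of cells."""
        node = self.root
        while node.children is not None:
            child = max(node.children, key=lambda c: c.count)
            if child.count == 0:
                break
            node = child
        return (node.lo + node.hi) / 2.0


def noisy_reward(action, rng):
    # Toy reward (assumption): peaked at action = 0.7, with Gaussian noise.
    return 1.0 - (action - 0.7) ** 2 + rng.gauss(0.0, 0.05)


hoo = HOO()
noise = random.Random(1)
actions = []
for _ in range(300):
    a, path = hoo.select()
    hoo.update(path, noisy_reward(a, noise))
    actions.append(a)
print(round(hoo.best_action(), 3))
```

The sketch recomputes the bounds recursively on every selection, which keeps the code short but is only practical for a few hundred rounds. No action grid is fixed in advance: cells are split only where samples land, which is the sense in which manual discretization is avoided.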
DOI:
10.1609/icaps.v21i1.13484