Proceedings:
Book One
Volume
Issue:
Proceedings of the International Conference on Automated Planning and Scheduling, 27
Track:
Robotics Track
Abstract:
Optimizing policies for real-time control of humanoid robots is a difficult task due to the continuous and stochastic nature of the state and action spaces. In this paper, we propose two contributions: a learning procedure to train a predictive motion model, and RFPI, a solver for continuous state and action MDPs. We use the predictive model as a transition model to train policies for robot soccer. Our method requires no external hardware and only a small amount of human work, and it outperforms the expert policy used by our team, Rhoban, which won the 2016 edition of RoboCup in the kid-size soccer league. Moreover, the proposed method adapts to non-holonomic robots more efficiently than the expert approach. Our results are confirmed by both simulations and real-robot experiments.
DOI:
10.1609/icaps.v27i1.13861