Proceedings:
Proceedings of the Twentieth International Conference on Machine Learning
Volume
Issue:
Proceedings of the Twentieth International Conference on Machine Learning
Track:
Contents
Downloads:
Abstract:
We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization algorithms, we use the fast Cross Entropy method. The suggested framework is described for several reward criteria and its effectiveness is demonstrated for a grid world navigation task and for an inventory control problem.
ICML
Proceedings of the Twentieth International Conference on Machine Learning