Suppose we allow the controller to perform arbitrary search and to base its control on the backed-up information. To do this, we must make two decisions: the order in which search nodes are expanded, and when to stop searching and actually "commit" to a control. Our approach is to view these decisions as a meta-level control problem. With some care in the formulation, a solution to this meta-level control problem provides us with a bounded optimal controller. We would like to solve this problem using algorithms from reinforcement learning.
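To make the two meta-level decisions concrete, the following is a minimal sketch, not the formulation used in this work. It assumes deterministic transitions, a max backup, and a hand-written meta-policy with a fixed expansion budget, where the text above instead proposes obtaining the meta-policy by reinforcement learning. All names (`Node`, `run_controller`, `budget_policy`, `COMMIT`, and the toy `SUCCESSORS` and `HEURISTIC` tables) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

COMMIT = None  # meta-action: stop searching and commit to a control


@dataclass
class Node:
    state: int
    reward: float = 0.0            # reward on the edge leading into this node
    value: float = 0.0             # current (backed-up) value estimate
    action: Optional[str] = None   # object-level control that led to this node
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def backup(node):
    """Propagate value estimates from an expanded node up to the root."""
    while node is not None:
        if node.children:
            node.value = max(c.reward + c.value for c in node.children)
        node = node.parent


def run_controller(root_state, successors, heuristic, meta_policy):
    """Search under a meta-policy; return (chosen control, expansions used)."""
    root = Node(root_state, value=heuristic(root_state))
    frontier = [root]
    expansions = 0
    while frontier:
        # Meta-level decision: which frontier node to expand, or COMMIT.
        choice = meta_policy(frontier, expansions)
        if choice is COMMIT:
            break
        node = frontier.pop(choice)
        for action, next_state, reward in successors(node.state):
            child = Node(next_state, reward, heuristic(next_state), action, node)
            node.children.append(child)
            frontier.append(child)
        backup(node)
        expansions += 1
    if not root.children:
        return None, expansions  # committed before any search was done
    best = max(root.children, key=lambda c: c.reward + c.value)
    return best.action, expansions


def budget_policy(budget):
    """Hand-written meta-policy (assumption): expand the frontier node with
    the highest value estimate until `budget` expansions are used, then commit."""
    def policy(frontier, expansions):
        if expansions >= budget:
            return COMMIT
        return max(range(len(frontier)), key=lambda i: frontier[i].value)
    return policy


# Toy problem (assumption): a depth-2 tree whose heuristic misleads at depth 1.
SUCCESSORS = {
    0: [("a", 1, 0.0), ("b", 2, 0.0)],
    1: [("a", 3, 1.0), ("b", 4, 0.0)],
    2: [("a", 5, 5.0), ("b", 6, 0.0)],
}
HEURISTIC = {1: 2.0, 2: 1.0}


def successors(state):
    return SUCCESSORS.get(state, [])


def heuristic(state):
    return HEURISTIC.get(state, 0.0)


shallow = run_controller(0, successors, heuristic, budget_policy(1))
deeper = run_controller(0, successors, heuristic, budget_policy(3))
```

In this toy problem, a budget of one expansion commits to control "a" on the strength of the misleading heuristic, whereas a budget of three expansions backs up the larger reward below state 2 and commits to "b". A learned meta-policy would replace `budget_policy` and trade the cost of further expansions against this kind of improvement in the committed control.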
Registration: ISBN 978-0-262-51106-3
Copyright: July 18-22, 1999, Orlando, Florida. Published by The AAAI Press, Menlo Park, California.