Exploiting Belief Locality in Run-Time Decision-Theoretic Planners

William H. Turkett, Jr., Wake Forest University; and John R. Rose, University of South Carolina

While Partially-Observable Markov Decision Processes have become a popular means of representing realistic planning problems, exact approaches to finding POMDP policies are extremely computationally complex. An alternative approach for control in POMDP domains is to use run-time optimization over action sequences in a dynamic decision network. While exact algorithms have to generate a policy over the entire belief space, a run-time approach only needs to reason about the reachable parts of the belief space. By combining run-time planning with caching, it is possible to exploit locality of belief states in POMDPS, allowing for significant speedups in plan generation. This paper introduces and evaluates an exact caching mechanism and a grid-based caching mechanism.

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.