Proceedings:
Vol. 21 (2011): Twenty-First International Conference on Automated Planning and Scheduling
Track:
Full Technical Papers
Abstract:
POMDP algorithms have made significant progress in recent years by allowing practitioners to find good solutions to increasingly large problems. Most approaches (including point-based and policy iteration techniques) operate by refining a lower bound on the optimal value function. Several approaches (e.g., HSVI2, SARSOP, grid-based approaches and online forward search) also refine an upper bound. However, approximating the optimal value function by an upper bound is computationally expensive, so tightness is often sacrificed to improve efficiency (e.g., by using the sawtooth approximation). In this paper, we describe a new approach to efficiently compute tighter bounds by (i) conducting a prioritized breadth-first search over the reachable beliefs, (ii) propagating upper bound improvements with an augmented POMDP, and (iii) using exact linear programming (instead of the sawtooth approximation) for upper bound interpolation. As a result, we can represent the bounds more compactly and significantly reduce the gap between upper and lower bounds on several benchmark problems.
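To illustrate why exact interpolation can be tighter than the sawtooth approximation, the following is a minimal sketch and not the paper's implementation. It assumes a two-state POMDP, where a belief is a pair (p, 1 - p), so the linear program over belief-value points reduces to checking every pair of stored points that brackets the query belief. The function names and the numbers in the example are hypothetical.

```python
from itertools import combinations


def sawtooth_upper_bound(b, corner_values, points):
    """Sawtooth interpolation: combine the corner beliefs with ONE interior point at a time."""
    # Value obtained by interpolating between the corner (deterministic) beliefs only.
    base = sum(p * v for p, v in zip(b, corner_values))
    best = base
    for bi, vi in points:
        base_i = sum(p * v for p, v in zip(bi, corner_values))
        # Largest weight that can be placed on the interior point bi.
        ratio = min(b[s] / bi[s] for s in range(len(b)) if bi[s] > 0)
        best = min(best, base + (vi - base_i) * ratio)
    return best


def exact_upper_bound_two_states(b, corner_values, points):
    """Exact interpolation for two states (assumption: the belief is (p, 1 - p)).

    In one dimension an optimal LP solution uses at most two points, so
    enumerating single points and bracketing pairs yields the LP optimum.
    """
    all_points = [((1.0, 0.0), corner_values[0]),
                  ((0.0, 1.0), corner_values[1])] + list(points)
    p = b[0]
    best = float("inf")
    # A stored point that coincides with the query belief can be used directly.
    for bi, vi in all_points:
        if bi[0] == p:
            best = min(best, vi)
    # Otherwise interpolate linearly between two points that bracket p.
    for (bi, vi), (bj, vj) in combinations(all_points, 2):
        lo, hi = sorted([(bi[0], vi), (bj[0], vj)])
        if lo[0] < hi[0] and lo[0] <= p <= hi[0]:
            w = (p - lo[0]) / (hi[0] - lo[0])
            best = min(best, (1 - w) * lo[1] + w * hi[1])
    return best


# Hypothetical upper-bound data: corner values and two interior belief points.
corner_values = [10.0, 10.0]
points = [((0.25, 0.75), 6.0), ((0.75, 0.25), 6.0)]
query = (0.5, 0.5)

saw = sawtooth_upper_bound(query, corner_values, points)
exact = exact_upper_bound_two_states(query, corner_values, points)
print(f"sawtooth: {saw:.3f}, exact: {exact:.3f}")
```

In this toy case the sawtooth bound can only combine the corners with a single interior point, while the exact interpolation combines both interior points and so returns a lower (tighter) value. With more than two states the pairwise enumeration no longer applies, and the linear program would instead be passed to a general solver such as `scipy.optimize.linprog`.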
DOI:
10.1609/icaps.v21i1.13467