Learning Heuristic Functions from Relaxed Plans

Sungwook Yoon, Alan Fern,Robert Givan

We present a novel approach to learning heuristic functions for AI planning domains. Given a state, we view a relaxed plan (RP) found from that state as a relational database, which includes the current state and goal facts, the actions in the RP, and the actionsí add and delete lists. We represent heuristic functions as linear combinations of generic features of the database, selecting features and weights using training data from solved problems in the target planning domain. Many recent competitive planners use RP-based heuristics, but focus exclusively on the length of the RP, ignoring other RP features. Since RP construction ignores delete lists, for many domains, RP length dramatically under-estimates the distance to a goal, providing poor guidance. By using features that depend on deleted facts and other RP properties, our learned heuristics can potentially capture patterns that describe where such under-estimation occurs. Experiments in the STRIPS domains of IPC 3 and 4 show that best-first search using the learned heuristic can outperform FF (Hoffmann & Nebel 2001), which provided our training data, and frequently outperforms the top performances in IPC 4.


This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.