Debugging a Policy: Automatic Action-Policy Testing in AI Planning

Authors

Marcel Steinmetz, Daniel Fišer, Hasan Ferit Eniser, Patrick Ferber, Timo P. Gros, Philippe Heim, Daniel Höller, Xandra Schuler, Valentin Wüstholz, Maria Christakis, Jörg Hoffmann

Saarland University; Saarland University; MPI-SWS; Saarland University and University of Basel; Saarland University; Saarland University; Saarland University; Saarland University; ConsenSys; MPI-SWS; Saarland University and German Research Center for Artificial Intelligence (DFKI)

Proceedings:

Book One

Volume

Issue:

Book One

Track:

Main Track

Abstract:

Testing is a promising way to gain trust in neural action policies π. Previous work on policy testing in sequential decision making targeted environment behavior leading to failure conditions. But if the failure is unavoidable given that behavior, then π is not actually to blame. For a situation to qualify as a "bug" in π, there must be an alternative policy π' that does better. We introduce a generic policy testing framework based on that intuition. This raises the bug confirmation problem: deciding whether or not a state is a bug. We analyze the use of optimistic and pessimistic bounds for the design of test oracles approximating that problem. We contribute an implementation of our framework in classical planning, experimenting with several test oracles and with random-walk methods generating test states biased toward poor policy performance and/or state novelty. We evaluate these techniques on policies π learned with ASNets. We find that they are able to effectively identify bugs in these π, and that our random-walk biases improve over uninformed baselines.
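
The bug definition above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a tiny deterministic state space, assumes that "does better" means a strictly lower cost-to-goal (with failure counted as infinite cost), and uses hypothetical names throughout (`policy_cost`, `optimal_cost`, `is_bug`, and the toy `transitions` table). Under those assumptions, a state is a bug exactly when some alternative behavior reaches the goal more cheaply than the policy does.

```python
import heapq
import math

def policy_cost(state, policy, transitions, goals, max_steps=1000):
    """Cost of running the policy from `state`; math.inf if it never reaches a goal."""
    cost = 0
    seen = set()
    for _ in range(max_steps):
        if state in goals:
            return cost
        if state in seen:
            return math.inf  # the policy revisits a state, so it loops forever
        seen.add(state)
        action = policy.get(state)
        if action is None or action not in transitions.get(state, {}):
            return math.inf  # the policy picks no applicable action
        state, step_cost = transitions[state][action]
        cost += step_cost
    return math.inf

def optimal_cost(state, transitions, goals):
    """Cheapest cost-to-goal from `state` (Dijkstra); math.inf if no goal is reachable."""
    dist = {state: 0}
    heap = [(0, state)]
    while heap:
        d, s = heapq.heappop(heap)
        if d > dist.get(s, math.inf):
            continue  # stale queue entry
        if s in goals:
            return d
        for successor, step_cost in transitions.get(s, {}).values():
            nd = d + step_cost
            if nd < dist.get(successor, math.inf):
                dist[successor] = nd
                heapq.heappush(heap, (nd, successor))
    return math.inf

def is_bug(state, policy, transitions, goals):
    """Assumed reading: a bug is a state where some alternative does strictly better."""
    return optimal_cost(state, transitions, goals) < policy_cost(
        state, policy, transitions, goals
    )

# Hypothetical toy task: state -> {action: (successor, cost)}
transitions = {
    "A": {"short": ("G", 1), "long": ("B", 1)},
    "B": {"go": ("G", 5)},
    "D": {"stay": ("D", 1)},  # dead end: the goal is unreachable from D
}
goals = {"G"}
policy = {"A": "long", "B": "go", "D": "stay"}

for s in ("A", "B", "D"):
    print(s, is_bug(s, policy, transitions, goals))
```

In this toy task, state A is a bug because the policy takes the long route although a cheaper one exists. State D is not: the policy fails there, but so would any other policy, which matches the point that unavoidable failure is not the policy's fault. Computing the exact optimum is generally too expensive for real planning tasks. One natural use of bounds under this reading is that the cost of any plan found from a state is an upper bound on the optimum, so a plan cheaper than the policy's run already confirms a bug.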

DOI:

10.1609/icaps.v32i1.19820

ICAPS
