Uncertainty-Aware Action Advising for Deep Reinforcement Learning Agents

Felipe Leno Da Silva; Pablo Hernandez-Leal; Bilal Kartal; Matthew E. Taylor

doi:10.1609/aaai.v34i04.6036

Authors

Felipe Leno Da Silva, University of Sao Paulo
Pablo Hernandez-Leal, Borealis AI
Bilal Kartal, Borealis AI
Matthew E. Taylor, Borealis AI

DOI:

https://doi.org/10.1609/aaai.v34i04.6036

Abstract

Although Reinforcement Learning (RL) has been one of the most successful approaches for learning in sequential decision-making problems, the sample complexity of RL techniques still represents a major challenge for practical applications. To combat this challenge, whenever a competent policy (e.g., either a legacy system or a human demonstrator) is available, the agent could leverage samples from this policy (advice) to improve sample efficiency. However, advice is normally limited, so it should ideally be directed to states where the agent is uncertain about the best action to execute. In this work, we propose Requesting Confidence-Moderated Policy advice (RCMP), an action-advising framework where the agent asks for advice when its epistemic uncertainty is high for a certain state. RCMP takes into account that the advice is limited and might be suboptimal. We also describe a technique to estimate the agent's uncertainty by making minor modifications to standard value-function-based RL methods. Our empirical evaluations show that, in Gridworld and Atari Pong scenarios, RCMP performs better than Importance Advising, than receiving no advice, and than receiving advice at random states.
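
The advice-request rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the agent keeps several independent Q-value estimates per state (called "heads" here) and uses the variance across them, averaged over actions, as a stand-in for epistemic uncertainty. The names `epistemic_uncertainty`, `AdviceRequester`, `budget`, and `threshold` are hypothetical, and the threshold value is arbitrary.

```python
from statistics import pvariance


def epistemic_uncertainty(q_heads):
    """Average, over actions, of the variance of Q-values across heads.

    q_heads is a list of per-head Q-value lists (one value per action).
    Assumption: disagreement between heads is used as an uncertainty proxy.
    """
    n_actions = len(q_heads[0])
    per_action = [
        pvariance([head[a] for head in q_heads]) for a in range(n_actions)
    ]
    return sum(per_action) / n_actions


def greedy_action(q_heads):
    """Pick the action with the highest Q-value averaged across heads."""
    n_actions = len(q_heads[0])
    mean_q = [
        sum(head[a] for head in q_heads) / len(q_heads)
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=lambda a: mean_q[a])


class AdviceRequester:
    """Asks a demonstrator for an action only when uncertainty is high
    and the (limited) advice budget is not yet exhausted."""

    def __init__(self, budget, threshold):
        self.budget = budget        # hypothetical: max number of advice requests
        self.threshold = threshold  # hypothetical: uncertainty cut-off

    def choose(self, q_heads, demonstrator, state=None):
        """Return (action, advised) for the current state."""
        uncertain = epistemic_uncertainty(q_heads) > self.threshold
        if uncertain and self.budget > 0:
            self.budget -= 1
            return demonstrator(state), True
        return greedy_action(q_heads), False


# Usage: heads that disagree trigger advice; heads that agree do not.
requester = AdviceRequester(budget=1, threshold=0.5)
disagree = [[1.0, 0.0], [-1.0, 2.0], [3.0, -2.0]]
agree = [[1.0, 0.0], [1.1, 0.1], [0.9, -0.1]]
always_action_1 = lambda state: 1  # stand-in demonstrator policy

first = requester.choose(disagree, always_action_1)
second = requester.choose(agree, always_action_1)
third = requester.choose(disagree, always_action_1)  # budget is now spent
```

In this sketch the budget models the abstract's point that advice is limited: once it is spent, the agent falls back to its own greedy choice even in states where its estimates disagree.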
