Proceedings:
Book One
Volume/Issue:
Proceedings of the International Conference on Automated Planning and Scheduling, 31
Track:
Special Track on Planning and Learning
Abstract:
This paper studies decision-making in two-player scenarios where the type (e.g., adversary, neutral, or teammate) of the other agent (the opponent) is unknown to the decision-making agent (the protagonist), an abstraction of security-domain applications. In these settings, the protagonist's reward depends on the opponent's type, but this type is private information known only to the opponent and thus hidden from the protagonist. In contrast, as is often the case, the protagonist's type is assumed to be known to the opponent, and this information asymmetry significantly complicates the protagonist's decision-making. In particular, to determine the best actions to take, the protagonist must infer the opponent's type from observations and agent modeling. To address this problem, this paper presents an opponent-type deduction module based on Bayes' rule. This inference module takes as input an imagined opponent decision-making rule (the opponent model) together with the observed history of the opponent's states and actions, and outputs a belief over the opponent's hidden type. A multiagent reinforcement learning approach is used to develop this game-theoretic opponent model through self-play, which avoids the expensive data-collection step of interacting with a real opponent; the multiagent formulation also captures the strategic interaction and reasoning between agents. In addition, ensemble training is applied to avoid overfitting to a single opponent model during training, so the learned protagonist policy remains effective against unseen opponents. Experimental results show that the proposed game-theoretic modeling, explicit opponent-type inference, and ensemble training significantly improve decision-making performance over baseline approaches and generalize well to adversaries not seen during training.
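To make the Bayes'-rule inference step concrete, the following minimal Python sketch (not code from the paper) shows one belief update over a discrete set of candidate opponent types. It assumes access to a learned opponent model, here a hypothetical function opponent_policy(t, state) that returns a type-t opponent's action distribution; all names are illustrative assumptions, not the authors' implementation.

import numpy as np

def update_belief(belief, opponent_policy, state, action):
    """One Bayes'-rule update of the belief over hidden opponent types.

    belief          -- prior probability vector over the candidate types
    opponent_policy -- hypothetical learned model: opponent_policy(t, state)
                       returns the action distribution of a type-t opponent
    state, action   -- the observed state and the action the opponent took
    """
    # Likelihood of the observed action under each candidate type.
    likelihood = np.array([opponent_policy(t, state)[action]
                           for t in range(len(belief))])
    posterior = belief * likelihood        # Bayes' rule, unnormalized
    total = posterior.sum()
    if total == 0.0:                       # observation impossible under all types:
        return np.full_like(belief, 1.0 / len(belief))  # fall back to uniform
    return posterior / total               # normalize to a probability distribution

# Example usage with three candidate types (adversary, neutral, teammate):
# belief = np.array([1/3, 1/3, 1/3])
# belief = update_belief(belief, opponent_policy, state, action)

Repeating this update over the opponent's observed state-action history yields the posterior belief that, per the abstract, the protagonist uses to select its actions.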
DOI:
10.1609/icaps.v31i1.16006