Published:
2020-06-02
Proceedings:
Proceedings of the AAAI Conference on Artificial Intelligence, 34
Issue:
Vol. 34 No. 10: Issue 10: AAAI-20 Student Tracks
Track:
Student Abstract Track
Abstract:
Exploration efficiency is a challenge for multi-agent reinforcement learning (MARL), because the policy learned by confederate MARL depends on the interaction among agents. Rewards that are less informative than the labels used in supervised learning also restrict the learning speed of MARL. This paper proposes a novel communication method that helps agents focus on different exploration subareas, guiding MARL to accelerate exploration. We propose a predictive network to forecast the reward of the current state-action pair and use the guidance learned by the predictive network to modify the reward function. An improved prioritized experience replay is employed to help agents better take advantage of the different knowledge learned by different agents. Experimental results demonstrate that the proposed algorithm outperforms existing methods in cooperative multi-agent environments.
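The abstract does not specify the predictive network, the reward modification rule, or the replay improvement, so the following is only a minimal illustrative sketch under stated assumptions, not the paper's algorithm. It assumes a linear model as a stand-in for the predictive network, an additive shaping rule with a hypothetical weight `beta`, and standard proportional prioritization with a hypothetical exponent `alpha`. All class, function, and parameter names are invented for illustration.

```python
import random


class LinearRewardPredictor:
    """Hypothetical stand-in for a predictive network: a linear model fit by SGD."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, features):
        # Forecast the reward for one encoded state-action pair.
        return sum(w * x for w, x in zip(self.w, features)) + self.b

    def update(self, features, reward):
        # One squared-error gradient step; returns the error measured before the step.
        error = self.predict(features) - reward
        self.w = [w - self.lr * error * x for w, x in zip(self.w, features)]
        self.b -= self.lr * error
        return error


def shaped_reward(env_reward, predicted_reward, beta=0.5):
    # Assumed shaping rule: add a scaled reward forecast to the environment reward.
    return env_reward + beta * predicted_reward


def sample_prioritized(priorities, k, alpha=0.6, rng=random):
    # Proportional prioritization: P(i) is proportional to (|p_i| + eps) ** alpha.
    weights = [(abs(p) + 1e-6) ** alpha for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=k)
```

In a training loop of this shape, each agent would call `update` on observed transitions, store `shaped_reward(...)` in the replay buffer, and draw minibatch indices with `sample_prioritized`, using the returned prediction errors as one possible choice of priority.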
DOI:
10.1609/aaai.v34i10.7247
ISSN 2374-3468 (Online), ISSN 2159-5399 (Print), ISBN 978-1-57735-835-0 (10 issue set)
Published by AAAI Press, Palo Alto, California, USA. Copyright © 2020, Association for the Advancement of Artificial Intelligence. All Rights Reserved.