IPOMDP-Net: A Deep Neural Network for Partially Observable Multi-Agent Planning Using Interactive POMDPs

Authors

  • Yanlin Han, University of Illinois at Chicago
  • Piotr Gmytrasiewicz, University of Illinois at Chicago

DOI

https://doi.org/10.1609/aaai.v33i01.33016062

Abstract

This paper introduces the IPOMDP-net, a neural network architecture for multi-agent planning under partial observability. It embeds an interactive partially observable Markov decision process (I-POMDP) model, together with a QMDP planning algorithm that solves the model, into a single network architecture. The IPOMDP-net is fully differentiable and allows end-to-end training. In the learning phase, we train an IPOMDP-net on various fixed and randomly generated environments in a reinforcement learning setting, assuming observable reinforcements and unknown (randomly initialized) model functions. In the planning phase, we test the trained network on new, unseen variants of the environments, using the learned model to plan without reinforcements. Empirical results show that our model-based IPOMDP-net outperforms a state-of-the-art model-free network and generalizes better to larger, unseen environments. Our approach provides a general neural computing architecture for multi-agent planning using I-POMDPs. It suggests that, in a multi-agent setting, having a model of other agents benefits our decision-making, resulting in a policy of higher quality and better generalizability.
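The architecture the abstract describes couples a Bayes filter with a QMDP-style planner, both expressed as differentiable tensor operations. The sketch below illustrates that coupling in JAX; it is a minimal illustration under stated assumptions, not the authors' implementation. The function names (qmdp_plan, belief_update), the tensor layouts of the model functions T, Z, and R, and the fixed horizon K and discount gamma are all assumptions, and the I-POMDP nesting over models of other agents is omitted for brevity.

```python
import jax.numpy as jnp
from jax.nn import softmax

def qmdp_plan(T, R, b, K=10, gamma=0.95):
    """QMDP planning as differentiable tensor operations (illustrative sketch).
    T: [A, S, S'] transition probabilities, R: [S, A] rewards, b: [S] belief."""
    V = jnp.zeros(R.shape[0])
    Q = R
    for _ in range(K):
        # One value-iteration backup on the underlying (fully observable) MDP.
        Q = R + gamma * jnp.einsum('asp,p->sa', T, V)
        V = Q.max(axis=1)
    q_b = b @ Q                  # QMDP: belief-weighted action values, shape [A]
    return softmax(q_b)          # soft action distribution keeps gradients flowing

def belief_update(T, Z, b, a, o):
    """One Bayes-filter step. Z: [A, S', O] observation probabilities."""
    pred = b @ T[a]              # predict: marginalize over the current state
    post = pred * Z[a, :, o]     # correct: weight by observation likelihood
    return post / (post.sum() + 1e-8)
```

Because every step is a differentiable operation, randomly initialized T, Z, and R can be trained end-to-end from observable reinforcements and then reused at test time to plan without them, matching the learning and planning phases described in the abstract.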

Published

2019-07-17

How to Cite

Han, Y., & Gmytrasiewicz, P. (2019). IPOMDP-Net: A Deep Neural Network for Partially Observable Multi-Agent Planning Using Interactive POMDPs. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 6062-6069. https://doi.org/10.1609/aaai.v33i01.33016062

Section

AAAI Technical Track: Multiagent Systems