Heuristic Selection of Actions in Multiagent Reinforcement Learning

Reinaldo A. C. Bianchi, Carlos H. C. Ribeiro, Anna H. R. Costa

This work presents a new algorithm, called Heuristically Accelerated Minimax-Q (HAMMQ), that allows the use of heuristics to speed up the well-known Multiagent Reinforcement Learning algorithm Minimax-Q. A heuristic function that influences the choice of the actions characterises the HAMMQ algorithm. This function is associated with a preference policy that indicates that a certain action must be taken instead of another. A set of empirical evaluations were conducted for the proposed algorithm in a simplified simulator for the robot soccer domain, and experimental results show that even very simple heuristics enhances significantly the performance of the multiagent reinforcement learning algorithm.

Subjects: 12.1 Reinforcement Learning; 7.1 Multi-Agent Systems

Submitted: Oct 15, 2006

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.