Proceedings:
Vol. 20 (2010): Twentieth International Conference on Automated Planning and Scheduling
Volume
Issue:
Vol. 20 (2010): Twentieth International Conference on Automated Planning and Scheduling
Track:
Short Technical Papers
Downloads:
Abstract:
Upper Confidence bounds applied to Trees (UCT), a bandit-based Monte-Carlo sampling algorithm for planning, has recently been the subject of great interest in adversarial reasoning. UCT has been shown to outperform traditional minimax based approaches in several challenging domains such as Go and Kriegspiel, although minimax search still prevails in other domains such as Chess. This work provides insights into the properties of adversarial search spaces that play a key role in the success or failure of UCT and similar sampling-based approaches. We show that certain "early loss" or "shallow trap" configurations, while unlikely in Go, occur surprisingly often in games like Chess (even in grandmaster games). We provide evidence that UCT, unlike minimax search, is unable to identify such traps in Chess and spends a great deal of time exploring much deeper game play than needed.
DOI:
10.1609/icaps.v20i1.13437
ICAPS
Vol. 20 (2010): Twentieth International Conference on Automated Planning and Scheduling