AAAI Publications, The Thirty-First International FLAIRS Conference

A Comparison of Reinforcement Learning Methodologies in Two-Party and Three-Party Negotiation Dialogue
Gang Xiao, Kallirroi Georgila

Last modified: 2018-05-10


We use reinforcement learning to learn dialogue policies in a collaborative furniture layout negotiation task. We employ a variety of methodologies (i.e., learning against a simulated user versus co-learning) and algorithms. Our policies reach the best solution, or a good one, across a variety of settings and initial conditions, including in the presence of noise (e.g., due to speech recognition or natural language understanding errors). They also perform well in situations not observed during training. Policies trained against a simulated user perform well when interacting with policies trained through co-learning, and vice versa. Furthermore, policies trained in a two-party setting are successfully applied to a three-party setting, and vice versa.


dialogue policy learning; multi-party negotiation; reinforcement learning
