On large e-commerce websites, sellers have been observed to engage in fraudulent behaviour, faking historical transactions in order to receive favourable treatment from the platform, specifically the allocation of additional buyer impressions, which results in higher revenue for them but not for the system as a whole. This emergent phenomenon has attracted considerable attention, with previous approaches focusing on detecting illicit practices and punishing the miscreants. In this paper, we employ the principles of reinforcement mechanism design, a framework that combines the fundamental goals of classical mechanism design, i.e. the consideration of agents' incentives and their alignment with the objectives of the designer, with deep reinforcement learning for optimizing performance based on these incentives. In particular, we first set up a deep-learning framework for predicting the sellers' rationality, based on real data from any allocation algorithm. We use data from one of the largest e-commerce platforms worldwide and train a neural network model to predict the extent to which sellers will engage in fraudulent behaviour. Using this rationality model, we then employ an algorithm based on deep reinforcement learning to optimize the designer's objectives and compare its performance against several natural heuristics, including the platform's implementation and incentive-based mechanisms from the related literature.
Published Date: 2018-02-08
Registration: ISSN 2374-3468 (Online), ISSN 2159-5399 (Print)
Copyright: Published by AAAI Press, Palo Alto, California, USA. Copyright © 2018, Association for the Advancement of Artificial Intelligence. All Rights Reserved.