Proceedings:
Proceedings of the AAAI Conference on Artificial Intelligence, 36
Volume:
36
Issue:
No. 5: AAAI-22 Technical Tracks 5
Track:
AAAI Technical Track on Game Theory and Economic Paradigms
Abstract:
We consider a sequential blocked matching (SBM) model where strategic agents repeatedly report ordinal preferences over a set of services to a central planner. The planner's goal is to elicit agents' true preferences and design a policy that matches services to agents to maximize the expected social welfare, with the added constraint that each matched service can be blocked or unavailable for a number of time periods. Naturally, SBM models the repeated allocation of reusable services to a set of agents where each allocated service becomes unavailable for a fixed duration. We first consider the offline SBM setting, where the strategic agents are aware of their true preferences. We measure the performance of any policy by distortion, the worst-case multiplicative approximation guaranteed by that policy. For the setting with s services, we establish lower bounds of Ω(s) and Ω(√s) on the distortions of any deterministic and randomised mechanisms, respectively. We complement these results by providing deterministic and randomised policies based on random serial dictatorship which are approximately truthful, as measured by incentive ratio, and which match our lower bounds. Our results show that there is a significant improvement if one considers the class of randomised policies. Finally, we consider the online SBM setting with bandit feedback, where each agent is initially unaware of her true preferences and the planner must help each agent learn her preferences through the matching of services over time. We design an approximately truthful mechanism based on the explore-then-commit paradigm, which achieves logarithmic dynamic approximate regret.
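The blocking constraint described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the mechanism from the paper: it runs a plain serial dictatorship in a fresh random order each round and assumes that a matched service is unavailable to all agents for a fixed number of rounds. The function name `simulate_sbm`, its parameters, and the convention that a blocking length of 1 means "available again in the next round" are all hypothetical choices made for this illustration.

```python
import random


def simulate_sbm(preferences, block_len, horizon, seed=0):
    """Serial dictatorship with blocking (illustrative sketch only).

    preferences: dict agent -> list of services, most preferred first.
    block_len:   dict service -> rounds the service stays occupied once
                 matched, counting the matched round itself (must be >= 1).
    horizon:     number of rounds to simulate.
    Returns a list with one dict per round: agent -> service or None.
    """
    rng = random.Random(seed)
    agents = list(preferences)
    # free_at[s] is the first round in which service s can be matched again.
    free_at = {s: 0 for s in block_len}
    history = []
    for t in range(horizon):
        order = agents[:]
        rng.shuffle(order)  # random picking order, redrawn every round
        matching = {}
        for agent in order:
            # The agent takes her most preferred service that is not blocked.
            choice = next(
                (s for s in preferences[agent] if free_at[s] <= t), None
            )
            matching[agent] = choice
            if choice is not None:
                free_at[choice] = t + block_len[choice]
        history.append(matching)
    return history


if __name__ == "__main__":
    prefs = {"alice": ["a", "b"], "bob": ["a", "b"]}
    for t, m in enumerate(simulate_sbm(prefs, {"a": 2, "b": 1}, horizon=3)):
        print(t, m)
```

In this toy instance both agents prefer service `a`, which stays blocked for two rounds, so in the second round one agent receives `b` and the other goes unmatched. Which agent gets which service depends on the random order.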
DOI:
10.1609/aaai.v36i5.20411