Proceedings:
No. 12: AAAI-21 Technical Tracks 12
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 35
Track:
AAAI Technical Track on Machine Learning V
Downloads:
Abstract:
We study MNL bandits, which is a variant of the traditional multi-armed bandit problem, under risk criteria. Unlike the ordinary expected revenue, risk criteria are more general goals widely used in industries and business. We design algorithms for a broad class of risk criteria, including but not limited to the well-known conditional value-at-risk, Sharpe ratio, and entropy risk, and prove that they suffer a near-optimal regret. As a complement, we also conduct experiments with both synthetic and real data to show the empirical performance of our proposed algorithms.
DOI:
10.1609/aaai.v35i12.17245
AAAI
Proceedings of the AAAI Conference on Artificial Intelligence, 35