AAAI Publications, Thirty-First AAAI Conference on Artificial Intelligence

Font Size: 
Infinitely Many-Armed Bandits with Budget Constraints
Haifang Li, Yingce Xia

Last modified: 2017-02-13


We study the infinitely many-armed bandit problem with budget constraints, where the number of arms can be infinite and much larger than the number of possible experiments. The player aims at maximizing his/her total expected reward under a budget constraint B for the cost of pulling arms. We introduce a weak stochastic assumption on the ratio of expected-reward to expected-cost of a newly pulled arm which characterizes its probability of being a near-optimal arm. We propose an algorithm named RCB-I to this new problem, in which the player first randomly picks K arms, whose order is sub-linear in terms of B, and then runs the algorithm for the finite-arm setting on the selected arms. Theoretical analysis shows that this simple algorithm enjoys a sub-linear regret in term of the budget B. We also provide a lower bound of any algorithm under Bernoulli setting. The regret bound of RCB-I matches the lower bound up to a logarithmic factor. We further extend this algorithm to the any-budget setting (i.e., the budget is unknown in advance) and conduct corresponding theoretical analysis.


Budgeted Multi-Armed Bandits; Infinitely Many Arms; Regret Analysis

Full Text: PDF