Many computational problems can be solved by multiple algorithms, with different algorithms fastest for different problem sizes, input distributions, and hardware characteristics. We consider the problem of algorithm selection: dynamically choose an algorithm to attack an instance of a problem with the goal of minimizing the overall execution time. We formulate the problem as a kind of Markov decision process (MDP), and use ideas from reinforcement learning to solve it. The well known Q-learning algorithm is adapted for this case in a way that combines both Monte-Carlo and Temporal Difference methods. Our initial experiments focus on the problem of order statistic selection. The encouraging results reveal the potential of applying learning methods to traditional computational problems.