We have constructed a two-agent machine learning architecture for intelligent tutoring systems (ITS). The architecture centralizes the reasoning of an ITS in a single component, which allows teaching goals to be customized and simplifies performance improvements. The first agent learns a model of how students perform when using the tutor in a variety of contexts. The second agent is given this model of student behavior together with a goal specifying the desired educational objective, and uses reinforcement learning to derive a teaching policy that meets that goal. Component evaluation studies show that each agent performs adequately in isolation. We have also evaluated the complete architecture with actual students. The results show that the architecture succeeded in learning a teaching policy that met the specified educational objective. Although this set of machine learning agents has been integrated with one specific intelligent tutor, the general technique could be applied to a broad class of ITS.
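
The division of labor between the two agents can be illustrated with a minimal sketch under strong simplifying assumptions. Everything named here is hypothetical and chosen only for illustration: the hand-written `simulate_student` function stands in for the learned student model, the four skill levels and the `goal_reward` function stand in for the state space and the educational goal, and tabular Q-learning stands in for the reinforcement learner. None of these is claimed to match the models used in the actual system.

```python
import random

N_LEVELS = 4  # hypothetical skill levels and problem difficulties, 0..3


def simulate_student(skill, difficulty, rng):
    """Toy stand-in for the learned student model (first agent).

    Returns (correct, next_skill) for a student at `skill` given a
    problem of `difficulty`. The probabilities are invented.
    """
    p_correct = min(0.95, max(0.05, 0.6 + 0.25 * (skill - difficulty)))
    correct = rng.random() < p_correct
    # Assumption: solving a problem at or above the current level raises skill.
    if correct and difficulty >= skill and skill < N_LEVELS - 1:
        return correct, skill + 1
    return correct, skill


def goal_reward(skill, next_skill):
    """Hypothetical educational goal: reward gains in skill."""
    return float(next_skill - skill)


def learn_policy(goal=goal_reward, episodes=5000, steps=10,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Second agent: tabular Q-learning against the simulated student."""
    rng = random.Random(seed)
    q = [[0.0] * N_LEVELS for _ in range(N_LEVELS)]  # q[skill][difficulty]
    for _ in range(episodes):
        skill = 0
        for _ in range(steps):
            # Epsilon-greedy choice of the next problem difficulty.
            if rng.random() < epsilon:
                action = rng.randrange(N_LEVELS)
            else:
                action = max(range(N_LEVELS), key=lambda a: q[skill][a])
            _, next_skill = simulate_student(skill, action, rng)
            reward = goal(skill, next_skill)
            # Standard one-step Q-learning update.
            target = reward + gamma * max(q[next_skill])
            q[skill][action] += alpha * (target - q[skill][action])
            skill = next_skill
    # Greedy teaching policy: preferred difficulty for each skill level.
    policy = [max(range(N_LEVELS), key=lambda a: q[s][a])
              for s in range(N_LEVELS)]
    return q, policy


if __name__ == "__main__":
    _, policy = learn_policy()
    print("difficulty chosen at each skill level:", policy)
```

Passing a different function as `goal` changes the teaching policy that is learned without touching the student model, which is the kind of goal customization the architecture is meant to allow.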