One of the most important elements of agent performance in multi-agent systems is an agent's ability to predict how other agents will behave. In many domains, several modeling systems are already available for making behavior predictions, but the best choice for a particular domain and a specific set of agents is often unclear. To find the best available prediction, we would like to know which model performs best in each possible world state of the domain. However, when resources are limited and each prediction query has a cost, we may need to decide which queries to pursue using only estimates of their benefit and cost: metareasoning. To estimate the benefit of a computation, a metareasoner needs a robust measurement of prediction quality. In this work we present a metareasoning system that relies on such a performance measurement, and we propose a novel model performance measure that fulfills this need: Weighted Prediction Divergence.