Proceedings: Book One
Issue: Proceedings of the AAAI Conference on Artificial Intelligence, 17
Track: Machine Learning and Data Mining

Abstract:
The bias-variance decomposition is a very useful and widely used tool for understanding machine-learning algorithms. It was originally developed for squared loss. In recent years, several authors have proposed decompositions for zero-one loss, but each has significant shortcomings. In particular, all of these decompositions have only an intuitive relationship to the original squared-loss one. In this paper, we define bias and variance for an arbitrary loss function, and show that the resulting decomposition specializes to the standard one for the squared-loss case, and to a close relative of Kong and Dietterich’s (1995) one for the zero-one case. The same decomposition also applies to variable misclassification costs. We show a number of interesting consequences of the unified definition. For example, Schapire et al.’s (1997) notion of margin can be expressed as a function of the zero-one bias and variance, making it possible to formally relate a classifier ensemble’s generalization error to the base learner’s bias and variance on training examples. Experiments with the unified definition lead to further insights.
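For the zero-one case, the unified definitions take the main prediction on an example to be the modal class over training sets, bias to be the zero-one loss of that main prediction with respect to the optimal prediction, and variance to be the average zero-one loss of the individual predictions with respect to the main prediction. Below is a minimal sketch of how these quantities could be estimated for a single test example, assuming a noise-free setting (the true label stands in for the optimal prediction) and an ensemble of models trained on resampled training sets standing in for the distribution over training sets; the function and variable names are illustrative, not from the paper.

from collections import Counter

def zero_one_bias_variance(predictions, true_label):
    """predictions: labels produced for one test example by models trained
    on different training sets; true_label: the example's actual class
    (taken here as the optimal prediction, i.e. the noise-free case)."""
    # Main prediction for zero-one loss: the modal (most frequent) label.
    main_pred = Counter(predictions).most_common(1)[0][0]
    # Bias: zero-one loss of the main prediction w.r.t. the true label.
    bias = int(main_pred != true_label)
    # Variance: average zero-one loss of individual predictions w.r.t.
    # the main prediction, i.e. how often models deviate from the mode.
    variance = sum(p != main_pred for p in predictions) / len(predictions)
    return bias, variance

# Example: seven models vote on one test example whose true class is "a".
bias, var = zero_one_bias_variance(["a", "a", "b", "a", "b", "a", "a"], "a")
print(bias, var)  # 0 0.2857...: unbiased on this example, with some variance

Averaging these per-example quantities over a test set gives estimated bias and variance curves of the kind the paper's experiments examine.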