Pruning and Dynamic Scheduling of Cost-Sensitive Ensembles

Authors

Wei Fan, IBM T.J. Watson Research

Fang Chu, University of California, Los Angeles

Haixun Wang and Philip S. Yu, IBM T.J. Watson Research

Proceedings:

Proceedings of the AAAI Conference on Artificial Intelligence, 18 (Book One)

Track:

Learning

Abstract:

Previous research has shown that an averaging ensemble can scale up learning over very large cost-sensitive datasets with linear speedup, independent of the learning algorithm. At the same time, it achieves the same or even better accuracy than a single model computed from the entire dataset. However, one major drawback is its inefficiency in prediction, since every base model in the ensemble has to be consulted to produce a final prediction. In this paper, we propose several approaches to reduce the number of base classifiers. Among the methods explored, our empirical studies show that the benefit-based greedy approach can safely remove more than 90% of the base models while maintaining or even exceeding the prediction accuracy of the original ensemble. Assuming that each base classifier consumes one unit of prediction time, removing 90% of the base classifiers translates to a tenfold prediction speedup. On top of pruning, we propose a novel dynamic scheduling approach to further reduce the "expected" number of classifiers employed in prediction. It measures the confidence of a prediction made by a subset of classifiers in the pruned ensemble. This confidence is used to decide whether more classifiers are needed to produce the same prediction as the original unpruned ensemble. This approach reduces the "expected" number of classifiers by another 25% to 75% without loss of accuracy.
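The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the selection criterion or the confidence measure, so everything here is an assumption made for illustration. The assumed parts are the function names (`total_benefit`, `greedy_prune`, `dynamic_predict`), the benefit table, the fixed decision `threshold`, and the simple `margin` test standing in for the confidence measure. Base models are represented as plain functions returning an estimated probability of the positive class.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical representation: a base model maps an example to P(positive | x).
Model = Callable[[float], float]
Benefit = Dict[Tuple[int, int], float]  # (predicted, actual) -> benefit


def total_benefit(models: Sequence[Model], xs: Sequence[float],
                  ys: Sequence[int], benefit: Benefit,
                  threshold: float = 0.5) -> float:
    """Total benefit of the averaging ensemble `models` on a validation set."""
    total = 0.0
    for x, y in zip(xs, ys):
        avg = sum(m(x) for m in models) / len(models)
        total += benefit[(int(avg >= threshold), y)]
    return total


def greedy_prune(models: Sequence[Model], xs: Sequence[float],
                 ys: Sequence[int], benefit: Benefit, k: int,
                 threshold: float = 0.5) -> List[int]:
    """Greedily pick k model indices; each step adds the model that yields
    the highest total benefit when combined with those already chosen."""
    selected: List[int] = []
    remaining = list(range(len(models)))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda i: total_benefit(
            [models[j] for j in selected + [i]], xs, ys, benefit, threshold))
        selected.append(best)
        remaining.remove(best)
    return selected


def dynamic_predict(models: Sequence[Model], x: float,
                    threshold: float = 0.5,
                    margin: float = 0.25) -> Tuple[int, int]:
    """Consult models in order and stop once the running average is at least
    `margin` away from the decision threshold (an assumed confidence test).
    Returns (prediction, number of models consulted). Needs >= 1 model."""
    running = 0.0
    used = 0
    avg = threshold
    for m in models:
        running += m(x)
        used += 1
        avg = running / used
        if abs(avg - threshold) >= margin:
            break
    return int(avg >= threshold), used


# Toy usage with three hypothetical base models of varying quality.
def good(x: float) -> float:
    return x


def bad(x: float) -> float:
    return 1.0 - x


def flat(x: float) -> float:
    return 0.5


toy_models = [bad, flat, good]
toy_xs = [0.125, 0.25, 0.75, 0.875]
toy_ys = [0, 0, 1, 1]
toy_benefit = {(1, 1): 10.0, (1, 0): -1.0, (0, 1): 0.0, (0, 0): 0.0}

kept = greedy_prune(toy_models, toy_xs, toy_ys, toy_benefit, k=2)
pruned = [toy_models[i] for i in kept]
confident = dynamic_predict(pruned, 0.875)
borderline = dynamic_predict(pruned, 0.625)
```

In this toy setup the pruned ensemble answers the clear-cut example after consulting one model and falls back to all kept models for the borderline one. That is the sense in which scheduling lowers the expected, rather than the worst-case, number of classifiers per prediction.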
