An Ensemble Technique for Stable Learners with Performance Bounds

Ian Davidson

Ensemble techniques such as bagging and DECORATE exploit the "instability" of learners, such as decision trees, to create a diverse set of models. However, creating a diverse set of models for stable learners such as naïve Bayes is difficult, as they are relatively insensitive to changes in the training data. Furthermore, many popular ensemble techniques lack a rigorous underlying theory and often provide no insight into how many models to build. We formally define a stable learner as one having a second-order derivative of the posterior density function and propose an ensemble technique specifically for stable learners. Our ensemble technique, bootstrap model averaging, creates a number of bootstrap samples from the training data, builds a model from each, and then sums the joint instance and class probability over all the models built. We show that for stable learners our ensemble technique, with infinitely many bootstrap samples, approximates posterior model averaging (also known as the optimal Bayes classifier, OBC). For finitely many bootstrap samples, we estimate the increase over the OBC error using Chebyshev bounds. We empirically illustrate our approach's usefulness for several stable learners and verify the correctness of our bounds.
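The bootstrap model averaging procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes a Bernoulli naïve Bayes base learner with Laplace smoothing, and all function names here are illustrative.

```python
# Sketch of bootstrap model averaging: draw bootstrap samples, fit one
# naive Bayes model per sample, and sum the joint probability P(x, c)
# over all models to classify. Illustrative only; names are assumptions.
import random
from collections import Counter

def fit_nb(data):
    """Fit a Bernoulli naive Bayes model on (features, label) pairs.

    Returns (priors, cond): class priors and Laplace-smoothed
    per-feature probabilities P(x_j = 1 | c).
    """
    counts = Counter(c for _, c in data)
    n, n_feat = len(data), len(data[0][0])
    priors = {c: counts[c] / n for c in counts}
    cond = {c: [(sum(x[j] for x, cc in data if cc == c) + 1) / (counts[c] + 2)
                for j in range(n_feat)]
            for c in counts}
    return priors, cond

def joint(model, x, c):
    """Joint probability P(x, c) under a single naive Bayes model."""
    priors, cond = model
    if c not in priors:          # class absent from this bootstrap sample
        return 0.0
    p = priors[c]
    for pj, xj in zip(cond[c], x):
        p *= pj if xj else (1.0 - pj)
    return p

def bootstrap_model_average(data, n_models, rng):
    """Build one model per bootstrap sample (sampling with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        models.append(fit_nb(sample))
    return models

def predict(models, x, classes):
    """Classify x by summing the joint P(x, c) over all bootstrap models."""
    scores = {c: sum(joint(m, x, c) for m in models) for c in classes}
    return max(scores, key=scores.get)
```

Summing the joint probabilities across models, rather than voting on class labels, is what lets the finite-sample ensemble approximate posterior model averaging as the number of bootstrap samples grows.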

This page is copyrighted by AAAI. All rights reserved.