Bagging is a simple yet effective design which combines multiple single learners to form an ensemble for prediction. Despite its popular usage in many real-world applications, existing research is mainly concerned with studying unstable learners as the key to ensure the performance gain of a bagging predictor, with many key factors remaining unclear. For example, it is not clear when a bagging predictor can outperform a single learner and what is the expected performance gain when different learning algorithms were used to form a bagging predictor. In this paper, we carry out comprehensive empirical studies to evaluate bagging predictors by using 12 different learning algorithms and 48 benchmark data-sets. Our analysis uses robustness and stability decompositions to characterize different learning algorithms, through which we rank all learning algorithms and comparatively study their bagging predictors to draw conclusions. Our studies assert that both stability and robustness are key requirements to ensure the high performance for building a bagging predictor. In addition, our studies demonstrated that bagging is statistically superior to most single base learners, except for KNN and Naïve Bayes (NB). Multi-layer perception (MLP), Naïve Bayes Trees (NBTree), and PART are the learning algorithms with the best bagging performance.