MultiBoosting and Multi-Strategy Ensemble Learning

MultiBoosting (also known as Boost Bagging) combines boosting and bagging, obtaining most of boosting’s superior bias reduction together with most of bagging’s superior variance reduction. MultiBoosting is an example of Multi-Strategy Ensemble Learning: we have shown that combining ensemble learning techniques can substantially reduce error.
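The structure of the algorithm is straightforward: run AdaBoost as usual, but partition the committee into sub-committees, and at each sub-committee boundary re-initialise the instance weights with a wagging-style random draw (approximately Exponential(1), standardised to sum to n). Each sub-committee is thus boosted from a different random weighting, which is where the extra variance reduction comes from. The following Java sketch illustrates this with decision stumps on a toy one-dimensional dataset; it is a minimal illustration of the idea, not the algorithm as specified in the paper (the handling of degenerate error rates, in particular, is simplified), and every identifier and data value in it is invented for the example.

// A schematic MultiBoost: AdaBoost.M1 over decision stumps, with the
// committee split into sub-committees and the instance weights re-drawn
// wagging-style at each sub-committee boundary. Toy data; sketch only.
import java.util.Random;

public class MultiBoostSketch {

    static final Random RNG = new Random(1);

    // Toy one-dimensional two-class data (deliberately noisy, so no
    // single stump is perfect and boosting has something to do).
    static final double[] x = {0.1, 0.4, 0.35, 0.8, 0.9, 0.62, 0.25, 0.7};
    static final int[]    y = {0,   0,   0,    1,   1,   1,    1,    0};

    public static void main(String[] args) {
        int T = 12, subCommittees = 3;      // committee size and partition
        int perSub = T / subCommittees;
        int n = x.length;
        double[] w = new double[n];
        java.util.Arrays.fill(w, 1.0);      // uniform initial weights

        double[] thresh = new double[T];    // per-member stump threshold
        int[] polarity = new int[T];        // per-member stump direction
        double[] alpha = new double[T];     // per-member vote weight

        for (int t = 0; t < T; t++) {
            // Wagging reset between sub-committees: the MultiBoost step.
            if (t > 0 && t % perSub == 0) w = waggedWeights(n);

            // Fit the stump minimising weighted training error.
            double bestErr = Double.MAX_VALUE;
            for (double c = 0.05; c < 1.0; c += 0.05)
                for (int p = 0; p <= 1; p++) {
                    double err = 0;
                    for (int i = 0; i < n; i++)
                        if (stump(x[i], c, p) != y[i]) err += w[i];
                    if (err < bestErr) { bestErr = err; thresh[t] = c; polarity[t] = p; }
                }

            double e = Math.max(bestErr / sum(w), 1e-10);
            if (e >= 0.5) { w = waggedWeights(n); continue; } // degenerate member: reset and skip (simplified)
            alpha[t] = Math.log((1 - e) / e);

            // AdaBoost.M1 reweighting: up-weight mistakes, down-weight the rest.
            for (int i = 0; i < n; i++)
                w[i] *= (stump(x[i], thresh[t], polarity[t]) != y[i])
                        ? 1 / (2 * e) : 1 / (2 * (1 - e));
        }

        // Weighted vote of the whole committee at a query point.
        double q = 0.5, vote = 0;
        for (int t = 0; t < T; t++)
            vote += alpha[t] * (stump(q, thresh[t], polarity[t]) == 1 ? 1 : -1);
        System.out.println("prediction at x=" + q + ": " + (vote >= 0 ? 1 : 0));
    }

    // polarity 1: predict class 1 above the threshold; polarity 0: below it.
    static int stump(double v, double c, int p) { return (v > c) == (p == 1) ? 1 : 0; }

    static double sum(double[] w) { double s = 0; for (double v : w) s += v; return s; }

    // Wagging-style weights: Exponential(1) draws, standardised to sum to n.
    static double[] waggedWeights(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) w[i] = -Math.log(1 - RNG.nextDouble());
        double s = sum(w);
        for (int i = 0; i < n; i++) w[i] *= n / s;
        return w;
    }
}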

MultiBoostAB is a standard component of the Weka machine learning workbench.
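In Weka 3 the implementation lives at weka.classifiers.meta.MultiBoostAB and can be used like any other classifier, from the Explorer, the command line, or the Java API. The fragment below is a minimal usage sketch: it assumes a local ARFF file named iris.arff (a placeholder), boosts J48 (Weka's C4.5 implementation), and cross-validates the result. Method names follow the Weka 3 API; check them against your installed version.

import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.meta.MultiBoostAB;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MultiBoostWekaDemo {
    public static void main(String[] args) throws Exception {
        // Load a dataset; "iris.arff" is a placeholder for any local ARFF file.
        Instances data = new DataSource("iris.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        MultiBoostAB mb = new MultiBoostAB();
        mb.setClassifier(new J48());   // J48 is Weka's C4.5 decision tree learner
        mb.setNumIterations(10);       // committee size
        mb.setNumSubCmtys(3);          // number of wagged sub-committees

        // Ten-fold cross-validation, as with any other Weka classifier.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(mb, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}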

Publications

Multistrategy Ensemble Learning: Reducing Error by Combining Ensemble Learning Techniques.
Webb, G. I., & Zheng, Z.
IEEE Transactions on Knowledge and Data Engineering, 16(8), 980-991, 2004.
[PDF] [Bibtex] [Abstract]

@Article{WebbZheng04,
Title = {Multistrategy Ensemble Learning: Reducing Error by Combining Ensemble Learning Techniques},
Author = {G.I. Webb and Z. Zheng},
Journal = {{IEEE} Transactions on Knowledge and Data Engineering},
Year = {2004},
Number = {8},
Pages = {980-991},
Volume = {16},
Abstract = {Ensemble learning strategies, especially Boosting and Bagging decision trees, have demonstrated impressive capacities to improve the prediction accuracy of base learning algorithms. Further gains have been demonstrated by strategies that combine simple ensemble formation approaches. In this paper, we investigate the hypothesis that the improvement in accuracy of multi-strategy approaches to ensemble learning is due to an increase in the diversity of ensemble members that are formed. In addition, guided by this hypothesis, we develop three new multi-strategy ensemble-learning techniques. Experimental results in a wide variety of natural domains suggest that these multi-strategy ensemble-learning techniques are, on average, more accurate than their component ensemble learning techniques.},
Address = {Los Alamitos, CA},
Keywords = {MultiBoosting and Boosting},
Publisher = {{IEEE} Computer Society},
}

MultiBoosting: A Technique for Combining Boosting and Wagging.
Webb, G. I.
Machine Learning, 40(2), 159-196, 2000.
[DOI] [Bibtex] [Abstract]

@Article{Webb00a,
Title = {MultiBoosting: A Technique for Combining Boosting and Wagging},
Author = {G. I. Webb},
Journal = {Machine Learning},
Year = {2000},
Number = {2},
Pages = {159-196},
Volume = {40},
Abstract = {MultiBoosting is an extension to the highly successful AdaBoost technique for forming decision committees. MultiBoosting can be viewed as combining AdaBoost with wagging. It is able to harness both AdaBoost's high bias and variance reduction with wagging's superior variance reduction. Using C4.5 as the base learning algorithm, MultiBoosting is demonstrated to produce decision committees with lower error than either AdaBoost or wagging significantly more often than the reverse over a large representative cross-section of UCI data sets. It offers the further advantage over AdaBoost of suiting parallel execution.},
Address = {Netherlands},
Doi = {10.1023/A:1007659514849},
Keywords = {MultiBoosting and Boosting and Bias-Variance},
Publisher = {Springer},
}

Stochastic Attribute Selection Committees with Multiple Boosting: Learning More Accurate and More Stable Classifier Committees.
Zheng, Z., & Webb, G. I.
Lecture Notes in Computer Science Vol. 1574: Methodologies for Knowledge Discovery and Data Mining – Proceedings of the Third Pacific-Asia Conference (PAKDD’99), Berlin/Heidelberg, pp. 123-132, 1999.
[PDF] [Bibtex]

Stochastic Attribute Selection Committees.
Zheng, Z., & Webb, G. I.
Lecture Notes in Computer Science Vol. 1502: Advanced Topics in Artificial Intelligence, Selected Papers from the Eleventh Australian Joint Conference on Artificial Intelligence (AI ’98), Berlin, pp. 321-332, 1998.
[PDF] [Bibtex]

Multiple Boosting: A Combination of Boosting and Bagging.
Zheng, Z., & Webb, G. I.
Proceedings of the 1998 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’98), pp. 1133-1140, 1998.
[PDF] [Bibtex]

Integrating Boosting and Stochastic Attribute Selection Committees for Further Improving The Performance of Decision Tree Learning.
Zheng, Z., Webb, G. I., & Ting, K. M.
Proceedings of the Tenth IEEE International Conference on Tools with Artificial Intelligence (ICTAI-98), Los Alamitos, CA, pp. 216-223, 1998.
[PDF] [Bibtex] [Abstract]

@InProceedings{ZhengWebbTing98,
Title = {Integrating Boosting and Stochastic Attribute Selection Committees for Further Improving The Performance of Decision Tree Learning},
Author = {Z. Zheng and G. I. Webb and K. M. Ting},
Booktitle = {Proceedings of the Tenth {IEEE} International Conference on Tools with Artificial Intelligence (ICTAI-98)},
Year = {1998},
Address = {Los Alamitos, CA},
Pages = {216-223},
Publisher = {{IEEE} Computer Society Press},
Abstract = {Techniques for constructing classifier committees including boosting and bagging have demonstrated great success, especially boosting for decision tree learning. This type of technique generates several classifiers to form a committee by repeated application of a single base learning algorithm. The committee members vote to decide the final classification. Boosting and bagging create different classifiers by modifying the distribution of the training set. SASC (Stochastic Attribute Selection Committees) uses an alternative approach to generating classifier committees by stochastic manipulation of the set of attributes considered at each node during tree induction, but keeping the distribution of the training set unchanged. We propose a method for improving the performance of boosting. This technique combines boosting and SASC. It builds classifier committees by manipulating both the distribution of the training set and the set of attributes available during induction. In the synergy SASC effectively increases the model diversity of boosting. Experiments with a representative collection of natural domains show that, on average, the combined technique outperforms either boosting or SASC alone in terms of reducing the error rate of decision tree learning.},
Keywords = {MultiBoosting and Boosting and Stochastic Attribute Selection Committees},
Location = {Taipei, Taiwan},
}
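The stochastic attribute selection described in this abstract amounts to one extra decision at each node of the growing tree: each candidate attribute is retained with some fixed probability, leaving the training distribution untouched, and independent draws across committee members supply the diversity. The sketch below shows one plausible form of that step in Java; the inclusion probability and all identifiers are invented for illustration, and this is a reading of the abstract above, not the authors' implementation.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// SASC-style stochastic attribute selection: at each node, keep each
// candidate attribute with probability p; unlike boosting or bagging,
// the distribution of the training set is left unchanged. Sketch only.
public class SascSketch {

    static final Random RNG = new Random(7);

    static List<Integer> candidateAttributes(List<Integer> available, double p) {
        List<Integer> subset = new ArrayList<>();
        for (int attr : available)
            if (RNG.nextDouble() < p) subset.add(attr);
        // If the draw excluded everything, fall back to the full set
        // so the node can still be split.
        return subset.isEmpty() ? new ArrayList<>(available) : subset;
    }

    public static void main(String[] args) {
        List<Integer> all = List.of(0, 1, 2, 3, 4, 5);
        // Each committee member grows its tree with independent draws,
        // which is where the ensemble's diversity comes from.
        for (int member = 0; member < 3; member++)
            System.out.println("member " + member + " candidates at root: "
                    + candidateAttributes(all, 0.33));
    }
}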