A Backward Adjusting Strategy and Optimization of the C4.5 Parameters to Improve C4.5's Performance

Authors

Jason Beck

Maria Garcia

Mingyu Zhong

Michael Georgiopoulos

Georgios C. Anagnostopoulos

Track:

All Papers

Abstract:

In machine learning, decision trees are used extensively to solve classification problems. Designing a decision tree classifier involves two main phases. The first phase grows the tree, quite often to its maximum size, using a set of data called the training data. The second phase prunes the tree, producing a smaller tree with better generalization (smaller error on unseen data). One of the most popular decision tree classifiers in the literature is the C4.5 decision tree classifier. In this paper, we introduce an additional phase, called the adjustment phase, inserted between the growing and pruning phases of the C4.5 decision tree classifier. The intent of this adjustment phase is to reduce the C4.5 error rate by adjusting the non-optimal splits created in the growing phase, thereby improving generalization (accuracy of the tree on unseen data). In most simulations conducted with the C4.5 decision tree classifier, its parameters, the confidence factor (CF) and the minimum number of split-off cases (MS), are set to their default values of 25% and 2, respectively, as recommended by Quinlan, the inventor of C4.5. The overall value of this work is that it provides the C4.5 user with a quantitative and qualitative assessment of the benefits of the proposed adjustment phase, as well as of the benefits of optimizing the C4.5 parameters CF and MS.
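
The abstract does not say how CF and MS are optimized. As an illustration only, one common approach is an exhaustive grid search that scores each (CF, MS) pair on held-out validation data. The sketch below assumes a caller-supplied `validation_error(cf, ms)` function that trains and evaluates a tree; that function, the name `grid_search_cf_ms`, the candidate grids, and the toy error surface are all hypothetical and not taken from the paper.

```python
from itertools import product
from typing import Callable, Iterable, Tuple


def grid_search_cf_ms(
    validation_error: Callable[[float, int], float],
    cf_values: Iterable[float],
    ms_values: Iterable[int],
) -> Tuple[float, int, float]:
    """Return the (cf, ms, error) triple with the lowest validation error.

    `validation_error` is a hypothetical callback: it is assumed to train a
    tree with the given CF and MS and return its error on validation data.
    """
    best = None
    for cf, ms in product(cf_values, ms_values):
        err = validation_error(cf, ms)
        # Keep the first pair that achieves the lowest error seen so far.
        if best is None or err < best[2]:
            best = (cf, ms, err)
    if best is None:
        raise ValueError("empty parameter grid")
    return best


def toy_error(cf: float, ms: int) -> float:
    """Toy stand-in for a real train-and-validate run (not real results)."""
    return (cf - 0.10) ** 2 + 0.001 * (ms - 5) ** 2


# Candidate grids are arbitrary; they include the defaults CF = 0.25, MS = 2.
best_cf, best_ms, best_err = grid_search_cf_ms(
    toy_error, [0.05, 0.10, 0.25, 0.50], [2, 5, 10]
)
```

In practice the callback would wrap an actual C4.5 implementation, and the error would typically be averaged over cross-validation folds rather than taken from a single split.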
