The recent emergence of data mining as a major application of machine learning has led to increased interest in fast rule induction algorithms. Such algorithms must process large numbers of examples efficiently while still achieving good accuracy. If e is the number of examples, many rule learners have O(e^4) asymptotic time complexity in noisy domains, and C4.5RULES has been empirically observed to sometimes require O(e^3) time. Recent advances have brought this bound down to O(e log^2 e), while maintaining accuracy comparable to that of C4.5RULES (Cohen 1995). Ideally, we would like an algorithm capable of inducing accurate rules in time linear in e, without becoming too expensive in other factors. This extended abstract presents such an algorithm.
Registration: ISBN 978-0-262-51091-2
Copyright: August 4-8, 1996, Portland, Oregon. Published by The AAAI Press, Menlo Park, California.