Proceedings:
Machine Learning in Information Access
Volume
Issue:
Papers from the 1996 AAAI Spring Symposium
Track:
Contents
Abstract:
Two methods for learning text classifiers are compared on classification problems that might arise in filtering and filing personal e-mail messages: a "traditional IR" method based on TF-IDF weighting, and a new method for learning sets of "keyword-spotting rules" based on the RIPPER rule learning algorithm. It is demonstrated that both methods obtain significant generalizations from a small number of examples; that both methods are comparable in generalization performance on problems of this type; and that both methods are reasonably efficient, even with fairly large training sets. However, the greater comprehensibility of the rules may be advantageous in a system that allows users to extend or otherwise modify a learned classifier.