Proceedings:
Machine Learning in Information Access
Issue:
Papers from the 1996 AAAI Spring Symposium
Track:
Contents
Abstract:
We report on our investigations into topic classification with USENET newsgroups. The task is to determine the newsgroup to which a new document should be posted. We train our system by forming "metadocuments" that represent each topic. We discuss our experiments with this method and provide evidence that choosing particular documents or words to use in these models degrades classification accuracy. We also describe a technique called classification-based retrieval for finding documents similar to a query document.
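As a rough illustration of the metadocument idea, the sketch below pools all training posts of a newsgroup into a single bag of words and assigns a new document to the newsgroup whose metadocument it most resembles. The abstract does not state how metadocuments are weighted or compared, so the raw term counts, the cosine similarity, the whitespace tokenizer, the toy posts, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; no stemming or stop list.
    return text.lower().split()


def build_metadocuments(posts_by_group):
    # One metadocument per newsgroup: term counts pooled over ALL of its
    # training posts, with no selection of documents or words.
    return {
        group: Counter(tok for post in posts for tok in tokenize(post))
        for group, posts in posts_by_group.items()
    }


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_groups(document, metadocs):
    # Score the new document against every metadocument, best match first.
    query = Counter(tokenize(document))
    scores = {group: cosine(query, meta) for group, meta in metadocs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def classify(document, metadocs):
    # The predicted newsgroup is the top-ranked metadocument.
    return rank_groups(document, metadocs)[0][0]


# Hypothetical toy training data, not taken from the paper.
posts_by_group = {
    "rec.autos": [
        "the engine stalls when the car idles",
        "new tires improved braking",
    ],
    "sci.space": [
        "the shuttle launch reached orbit",
        "the probe will orbit mars",
    ],
}

metadocs = build_metadocuments(posts_by_group)
predicted = classify("which orbit will the launch reach", metadocs)
print(predicted)
```

Pooling every post and every word into the metadocument mirrors the abstract's observation that choosing particular documents or words degrades accuracy; a real system would likely need a more careful tokenizer and term weighting than this sketch uses.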