Document Routing as Statistical Classification

David Hull, Jan Pedersen, and Hinrich Schutze

In this paper, we compare learning techniques based on statistical classification to traditional methods of relevance feedback for the document routing problem. We consider three classification techniques whose decision rules are derived via explicit error minimization: linear discriminant analysis, logistic regression, and neural networks. We demonstrate that the classifiers perform 10-15% better than relevance feedback via Rocchio expansion for the TREC-2 and TREC-3 routing tasks.
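To make the contrast concrete, the following is a minimal sketch of the two kinds of approach the abstract compares: Rocchio expansion, which builds a routing query from centroids of judged documents, and logistic regression, which fits a decision rule by minimizing an explicit error measure (here, log loss by gradient descent). Everything in it is an illustrative assumption rather than the authors' setup: the function names, the toy term-weight vectors, the Rocchio weights `alpha`, `beta`, and `gamma`, and the learning rate and step count are all hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rocchio_query(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio expansion: shift the query toward the relevant centroid
    and away from the non-relevant centroid (weights are assumed values)."""
    rel_c, non_c = centroid(relevant), centroid(nonrelevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_c, non_c)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(docs, labels, lr=0.5, steps=500):
    """Logistic regression fitted by batch gradient descent on mean log loss."""
    w, b, n = [0.0] * len(docs[0]), 0.0, len(docs)
    for _ in range(steps):
        # Prediction error for every document under the current weights.
        errors = [sigmoid(dot(w, x) + b) - y for x, y in zip(docs, labels)]
        for j in range(len(w)):
            w[j] -= lr * sum(e * x[j] for e, x in zip(errors, docs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Hypothetical toy collection: rows are documents, columns are term weights.
docs = [[1.0, 0.9, 0.0],
        [0.8, 1.0, 0.1],
        [0.0, 0.1, 1.0],
        [0.1, 0.0, 0.9]]
labels = [1.0, 1.0, 0.0, 0.0]          # 1 = judged relevant, 0 = not relevant
query = [1.0, 0.0, 0.0]

relevant = [d for d, y in zip(docs, labels) if y == 1.0]
nonrelevant = [d for d, y in zip(docs, labels) if y == 0.0]

expanded = rocchio_query(query, relevant, nonrelevant)
rocchio_scores = [dot(expanded, d) for d in docs]

w, b = fit_logistic(docs, labels)
logistic_scores = [sigmoid(dot(w, d) + b) for d in docs]
```

In both cases a new document would be routed by scoring it and comparing the score against a threshold or ranking it against other documents. The sketch is only meant to show where the two approaches differ: Rocchio fixes the weights by a formula, while the classifier learns them from the relevance judgments.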

This page is copyrighted by AAAI. All rights reserved.