Classifying Text Documents using Modular Categories and Linguistically Motivated Indicators

Eleazar Eskin and Matt Bogosian

In this paper we present two improvements to traditional machine learning text classifiers. The first is a decomposition of the classification space into several dimensions of categories, which breaks the categorization problem into smaller, more manageable parts. We discuss when decomposition is useful. The second is the incorporation of linguistically motivated indicators to supplement the classification. These indicators provide information about the structure of the document, which is used to improve classification accuracy.

This page is copyrighted by AAAI. All rights reserved.