Dictionary Requirements for Text Classification: A Comparison of Three Domains

Ellen Riloff

The type of dictionary required for a natural language processing system depends on both the nature of the task and the domain. For example, an in-depth comprehension task probably requires more knowledge than an information retrieval task. Similarly, technical domains are fundamentally different from event-based domains and require different types of lexical knowledge. We explore these issues by comparing the performance of four text classification algorithms that use varying amounts of lexical knowledge. We tested the algorithms on three different domains: terrorism, joint ventures, and microelectronics. We found that the algorithms produced dramatically different results on each domain, suggesting that the nature of the domain strongly influences the types of knowledge required to achieve good performance.

This page is copyrighted by AAAI. All rights reserved.