Proceedings:
Cross-Language Text and Speech Retrieval
Volume
Issue:
Papers from the 1997 AAAI Spring Symposium
Track:
Contents
Abstract:
The described approach to text categorization is based on a thematic representation of a text. A thematic representation consists of nodes of thematically related terms that model the topics of the text, with each node assigned a class reflecting its importance for the text. The representation is built from a detailed description of the domain. It makes it possible to process different types of texts, to use different systems of categories (in various languages) for text categorization, to adapt the system quickly to other formats and types of texts and/or other systems of categories, and to categorize texts using several systems of categories simultaneously. Most of the algorithm is not language-dependent.