Learning Complex Patterns for Document Categorization

Markus Junker and Andreas Abecker

Knowledge-based approaches to document categorization make use of well-elaborated and powerful pattern languages for writing classification rules by hand. Although such classification patterns have proven useful in many practical applications, algorithms for learning classifiers from examples mostly rely on much simpler representations of classification knowledge. In this paper, we describe a learning algorithm that employs a pattern language similar to the languages used for manual rule editing. We focus on learning three specific constructs of this pattern language: phrases, tolerance matches of words, and substring matches of words.
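
To make the three constructs concrete, the following minimal sketch shows one plausible reading of each as a test on a tokenized document. It is an illustration only, not the paper's pattern language or learning algorithm. It assumes that a phrase is a sequence of adjacent words, that a tolerance match accepts words within a small edit distance of the target, and that a substring match accepts any word containing a given fragment. All function names and the `max_errors` parameter are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def phrase_match(tokens, phrase):
    """Assumed meaning: the words of `phrase` occur adjacently and in order."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def tolerance_match(tokens, word, max_errors=1):
    """Assumed meaning: some token is within `max_errors` edits of `word`."""
    return any(edit_distance(t, word) <= max_errors for t in tokens)


def substring_match(tokens, fragment):
    """Assumed meaning: some token contains `fragment` as a substring."""
    return any(fragment in t for t in tokens)


# Hypothetical document with a misspelling ("invoce" for "invoice").
tokens = tokenize("The invoce lists health insurance payments")
print(phrase_match(tokens, ["health", "insurance"]))
print(tolerance_match(tokens, "invoice"))
print(substring_match(tokens, "pay"))
```

Under these assumptions, a tolerance match would let a rule for "invoice" still fire on a misspelled or misrecognized "invoce", while a substring match on "pay" would cover "payment", "payments", and "paying" with a single pattern.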
