Abstract:
Knowledge-poor corpus-based approaches to natural language processing are attractive in that they do not incur the difficulties associated with complex knowledge bases and real-world inferences. However, these kinds of language processing techniques in isolation often do not suffice for a particular task; for this reason we are interested in finding ways to combine various techniques and improve their results. Accordingly, we conducted experiments to refine the results of an automatic lexical discovery technique by making use of a statistically based syntactic similarity measure. The discovery program uses lexico-syntactic patterns to find instances of the hyponymy relation in large text bases. Once relations of this sort are found, they should be inserted into an existing lexicon or thesaurus. However, the terms in a relation may have multiple senses, which hampers automatic placement. To address this problem, we applied a term-similarity determination technique to the problem of choosing where, in an existing lexical hierarchy, to install a lexical relation. The union of these two corpus-based methods is promising, although only partially successful in the experiments run so far. Here we report some preliminary results and make suggestions for how to improve the technique in the future.
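
To make the idea of a lexico-syntactic pattern concrete, the following is a minimal sketch, not the discovery program described above. It assumes a single illustrative pattern ("X such as A, B, and C") and approximates noun phrases by single words; the names SUCH_AS and find_hyponyms are hypothetical, and a realistic system would use several patterns and a noun-phrase chunker.

```python
import re

# Hypothetical pattern: "<hypernym> such as <hyponym>, <hyponym>, and <hyponym>".
# Assumption: each noun phrase is a single word (no chunking or parsing).
SUCH_AS = re.compile(
    r"\b(\w+),? such as "
    r"(\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)

def find_hyponyms(text):
    """Return (hyponym, hypernym) pairs suggested by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        # Split the enumerated list on commas and on the final "and"/"or".
        for hyponym in re.split(r",? (?:and|or) |, ", match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs

sentence = "Many authors such as Herrick, Goldsmith, and Shakespeare wrote plays."
pairs = find_hyponyms(sentence)
```

Each extracted pair is only a candidate relation between word forms. Deciding which sense of a polysemous term the relation should attach to in an existing hierarchy is the separate placement problem that the similarity measure is meant to address.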