Building Semantic Concordances: Disambiguation Versus Annotation

George A. Miller

A semantic concordance has been defined (Miller, Leacock, Tengi, and Bunker, 1993) as "a textual corpus and a lexicon so combined that every substantive word in the text is linked to its appropriate sense in the lexicon." According to this definition, a semantic concordance can be viewed either as a corpus of disambiguated text or as a lexicon in which example sentences are available for many definitions. At Princeton we have now had more than two years' experience trying to build such interconnected databases. We have used WordNet (Miller, 1990; Miller and Fellbaum, 1991) as the lexical component, and the Brown Corpus (Kucera and Francis, 1967; Francis and Kucera, 1982) as the text. WordNet is a lexical database in which sets of synonyms represent lexicalized concepts, and semantic relations between words and concepts are represented by bidirectional pointers; the Brown Corpus is a collection of passages (each 2,000 words long) that are representative of published American writing in the 1960s. This paper reports some of our successes and explores some of the problems that we have encountered.
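The two views named in the definition can be illustrated with a small sketch. It is not the Princeton implementation and it does not use the WordNet or Brown Corpus file formats; the class `SemanticConcordance`, its methods, the sense identifiers such as `bank#1`, and the toy sentences are all assumptions made for illustration. For brevity only one word per sentence is linked to a sense, whereas a real semantic concordance links every substantive word.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A token is a word plus the identifier of its sense in the lexicon,
# or None when the word is left untagged (e.g. a function word).
Token = Tuple[str, Optional[str]]


@dataclass
class SemanticConcordance:
    """Toy pairing of a lexicon and a sense-tagged corpus (hypothetical)."""

    lexicon: Dict[str, str]  # sense identifier -> definition
    corpus: List[List[Token]] = field(default_factory=list)

    def add_sentence(self, tagged: List[Token]) -> None:
        # Every link must point at a sense that exists in the lexicon.
        for word, sense_id in tagged:
            if sense_id is not None and sense_id not in self.lexicon:
                raise KeyError(f"unknown sense {sense_id!r} for word {word!r}")
        self.corpus.append(list(tagged))

    def disambiguated_text(self) -> List[List[Token]]:
        # View 1: the corpus, with each tagged word carrying its sense.
        return self.corpus

    def examples_for(self, sense_id: str) -> List[str]:
        # View 2: the lexicon entry, with sentences that use this sense.
        return [
            " ".join(word for word, _ in sentence)
            for sentence in self.corpus
            if any(tag == sense_id for _, tag in sentence)
        ]


concordance = SemanticConcordance(
    lexicon={
        "bank#1": "sloping land beside a body of water",
        "bank#2": "a financial institution",
    }
)
concordance.add_sentence(
    [("they", None), ("fished", None), ("from", None),
     ("the", None), ("bank", "bank#1")]
)
concordance.add_sentence([("the", None), ("bank", "bank#2"), ("closed", None)])
```

Calling `concordance.examples_for("bank#1")` returns the sentences in which that sense occurs, while `concordance.disambiguated_text()` returns the same data read as sense-tagged text. Both views are served by a single set of links, which is the point of the definition.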
