A system for lexical acquisition is presented in which word meanings are represented by clusters of phrase patterns obtained from analysis of a text corpus. A sample of cases, in the form of a concordance of phrases in which a particular word occurs in the text, is used for the basic analysis. Clustering techniques are used to group together cases having similar grammar and/or meaning. The view taken is that words obtain their meaning from the category describing this clustering of cases. This category is theory-based in that it contains a model to represent the word meaning at an abstract level, whereas the cases provide empirical evidence that confirms or disproves the model. A complex category evolves as more cases are encountered. Each new case is either matched to an existing category or dynamically alters existing categories as needed to account for it. An experimental system is presented which includes syntactic and semantic analysis of phrases obtained from text. It uses a hand-built lexicon and grammar to bootstrap a learning process. The ability to dynamically alter category structure through interpretation of new cases is shown as a way to build lexical structure semi-automatically.
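As a rough illustration of the case-to-category step described above, the following Python sketch places each new case in its best-matching category or starts a new one. It is not the system's actual method: the feature-set representation of a case, the Jaccard similarity measure, the 0.5 threshold, the names `Category` and `assimilate`, and the sample concordance for "bank" are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A cluster of cases; `model` holds the features shared by all of them."""
    model: frozenset
    cases: list = field(default_factory=list)


def jaccard(a, b):
    """Overlap between two feature sets, from 0.0 (disjoint) to 1.0 (equal)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assimilate(categories, case, threshold=0.5):
    """Place `case` in its best-matching category, or start a new one.

    When a case joins a category, the category's model is generalized to the
    features shared by the old model and the new case (an assumed update rule).
    """
    features = frozenset(case)
    best = max(categories, key=lambda c: jaccard(c.model, features), default=None)
    if best is not None and jaccard(best.model, features) >= threshold:
        best.cases.append(features)
        best.model = best.model & features  # alter the category to fit the case
        return best
    new = Category(model=features, cases=[features])
    categories.append(new)
    return new


# Hypothetical concordance cases for the word "bank", as hand-made feature sets.
concordance = [
    {"prep:in", "obj:money", "verb:deposit"},
    {"prep:in", "obj:money", "verb:keep"},
    {"prep:on", "mod:river", "verb:sit"},
]

categories = []
for case in concordance:
    assimilate(categories, case)
```

With this toy input the first two cases end up in one category and the third starts a second. A real system would derive the features from syntactic and semantic analysis rather than from hand-written sets.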
Published Date: May 1998
Registration: ISBN 978-1-57735-051-4
Copyright: Published by The AAAI Press, Menlo Park, California