Proceedings:
Building Lexicons for Machine Translation
Volume
Issue:
Papers from the 1993 AAAI Spring Symposium
Track:
Contents
Abstract:
A new representational scheme for semantic information about words in different languages is introduced. Each word is represented as a vector in a multidimensional space. To derive the representations, basis vectors for one language are computed as linear approximations of 5,000-dimensional vectors of cooccurrence counts. Using an aligned corpus, the basis vectors of words occurring close to a target word in one of the languages under consideration are summed to compute the confusion vector of the target word. The paper describes the derivation of the representations for English and French and their application to identifying translation pairs.
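The summation step in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the basis vectors are tiny hand-made stand-ins for the reduced cooccurrence vectors, "close to" is taken to mean "in the same aligned segment", and candidate translations are ranked by cosine similarity. The names `BASIS`, `ALIGNED`, `confusion_vector`, `cosine`, and `best_translation` are hypothetical.

```python
import math

# Hypothetical toy basis vectors for English words. In the paper these are
# linear approximations of 5,000-dimensional cooccurrence vectors.
BASIS = {
    "bank": [1.0, 0.0, 0.0],
    "money": [0.9, 0.1, 0.0],
    "river": [0.0, 1.0, 0.0],
    "water": [0.1, 0.9, 0.0],
}

# Hypothetical aligned corpus: one (English tokens, French tokens) pair per segment.
ALIGNED = [
    (["bank", "money"], ["banque", "argent"]),
    (["money", "bank"], ["argent", "banque"]),
    (["river", "water"], ["fleuve", "eau"]),
    (["water", "river"], ["eau", "fleuve"]),
]


def confusion_vector(target, lang):
    """Sum the English basis vectors found in every segment containing target.

    Assumption: "close to the target" means "in the same aligned segment".
    For an English target, the target's own basis vector is left out.
    """
    dim = len(next(iter(BASIS.values())))
    total = [0.0] * dim
    for en_tokens, fr_tokens in ALIGNED:
        tokens = en_tokens if lang == "en" else fr_tokens
        if target not in tokens:
            continue
        for word in en_tokens:
            if lang == "en" and word == target:
                continue
            if word in BASIS:
                total = [t + b for t, b in zip(total, BASIS[word])]
    return total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def best_translation(english_word, french_candidates):
    """Pick the French candidate whose confusion vector is most similar."""
    source = confusion_vector(english_word, "en")
    return max(
        french_candidates,
        key=lambda fr: cosine(source, confusion_vector(fr, "fr")),
    )
```

With this toy data, `best_translation("river", ["banque", "fleuve"])` selects `"fleuve"`. The candidate list is deliberately restricted: French words that always appear in the same segments (here `"fleuve"` and `"eau"`) receive identical vectors under the segment-level assumption and cannot be told apart.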