Proceedings:
Acquiring (and Using) Linguistic (and World) Knowledge for Information Access
Issue:
Papers from the 2002 AAAI Spring Symposium
Track:
Contents
Abstract:
Latent Semantic Analysis (LSA) is at once a remarkably simple and remarkably effective model of language. Its foundation is the following extreme simplification: the meaning of a passage is assumed to be the sum of the meanings of the words it contains (with, of course, a special, restricted meaning of "meaning" relative to all that has been said about meaning in philosophy, linguistics, and literature). This simplification allows observed natural language, for example a large corpus of ordinary text, to be treated as a set of simultaneous linear equations that can be solved for the average meaning of the words, and consequently for the meaning of any passage. The solution technique used by LSA is Singular Value Decomposition (SVD) followed by empirically optimal dimension reduction. The dimension reduction made possible by SVD induces continuous-valued similarity relations between every word and every other word, including the more than 98 percent of word pairs that never co-occur in a typical training corpus.
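The pipeline described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and does not come from the paper: the six-passage toy corpus, the use of raw word counts, the choice of k = 2 retained dimensions (the abstract calls for an empirically chosen value), and the helper names `cosine` and `passage_vec`. It builds a word-by-passage matrix, applies NumPy's SVD, truncates to k dimensions, and compares words that never share a passage.

```python
import numpy as np

# Toy corpus (hypothetical); each string is one "passage".
passages = [
    "doctor treats patient",
    "physician treats patient",
    "doctor works hospital",
    "physician works hospital",
    "car needs fuel",
    "truck needs fuel",
]

vocab = sorted({w for p in passages for w in p.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-passage count matrix: A[i, j] = count of word i in passage j.
# Raw counts are an assumption of this sketch.
A = np.zeros((len(vocab), len(passages)))
for j, p in enumerate(passages):
    for w in p.split():
        A[index[w], j] += 1.0

# Singular Value Decomposition, then keep only the k largest dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # illustrative choice, not an empirically tuned value
word_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per word


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def passage_vec(text):
    """Passage meaning as the sum of its word vectors (the LSA simplification)."""
    return sum(word_vecs[index[w]] for w in text.split())


# "doctor" and "physician" never appear in the same passage ...
raw_overlap = float(A[index["doctor"]] @ A[index["physician"]])

# ... yet the reduced space assigns them a graded similarity.
sim_doc_phys = cosine(word_vecs[index["doctor"]], word_vecs[index["physician"]])
sim_doc_car = cosine(word_vecs[index["doctor"]], word_vecs[index["car"]])

# Two passages with no words in common can also be compared.
sim_passages = cosine(
    passage_vec("physician treats patient"),
    passage_vec("doctor works hospital"),
)

print(raw_overlap, sim_doc_phys, sim_doc_car, sim_passages)
```

In this toy setting, "doctor" and "physician" share no passage, yet their reduced vectors are nearly parallel because the two words occur in the same contexts, while "doctor" and "car" remain nearly orthogonal. A realistic application would use a far larger corpus and a value of k chosen by evaluation on a task.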