Published Date: 2018-02-08
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2018, Association for the Advancement of Artificial Intelligence All Rights Reserved.
Pre-trained distributed word representations have been proven to be useful in various natural language processing (NLP) tasks. However, the geometric basis of word representations and their relations to the representations of word's contexts has not been carefully studied yet. In this study, we first investigate such geometric relationship under a general framework, which is abstracted from some typical word representation learning approaches, and find out that only the directions of word representations are well associated to their context vector representations while the magnitudes are not. In order to make better use of the information contained in the magnitudes of word representations, we propose a hierarchical Gaussian model combined with maximum a posteriori estimation to learn word representations, and extend it to represent polysemous words. Our word representations have been evaluated on multiple NLP tasks, and the experimental results show that the proposed model achieved promising results, comparing to several popular word representations.