TY  - JOUR
AU  - Yan, Muheng
AU  - Lin, Yu-Ru
AU  - Hwa, Rebecca
AU  - Mert Ertugrul, Ali
AU  - Guo, Meiqi
AU  - Chung, Wen-Ting
PY  - 2020/05/26
Y2  - 2024/03/28
TI  - MimicProp: Learning to Incorporate Lexicon Knowledge into Distributed Word Representation for Social Media Analysis
JF  - Proceedings of the International AAAI Conference on Web and Social Media
JA  - ICWSM
VL  - 14
IS  - 1
SE  - Full Papers
DO  - 10.1609/icwsm.v14i1.7339
UR  - https://ojs.aaai.org/index.php/ICWSM/article/view/7339
SP  - 738-749
AB  - Lexicon-based methods and word embeddings are the two widely used approaches for analyzing texts in social media. The choice of an approach can have a significant impact on the reliability of the text analysis. For example, lexicons provide manually curated, domain-specific attributes about a limited set of words, while word embeddings learn to encode some loose semantic interpretations for a much broader set of words. Text analysis can benefit from a representation that offers both the broad coverage of word embeddings and the domain knowledge of lexicons. This paper presents MimicProp, a new graph-based method that learns a lexicon-aligned word embedding. Our approach improves over prior graph-based methods in terms of its interpretability (i.e., lexicon attributes can be recovered) and generalizability (i.e., new words can be learned to incorporate lexicon knowledge). It also effectively improves the performance of downstream analysis applications, such as text classification.
ER  - 