Emu: Enhancing Multilingual Sentence Embeddings with Semantic Specialization

Authors

  • Wataru Hirota, Osaka University
  • Yoshihiko Suhara, Megagon Labs
  • Behzad Golshan, Megagon Labs
  • Wang-Chiew Tan, Megagon Labs

DOI:

https://doi.org/10.1609/aaai.v34i05.6301

Abstract

We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework fine-tunes pre-trained multilingual sentence embeddings using two main components: a semantic classifier and a language discriminator. The semantic classifier improves the semantic similarity of related sentences, whereas the language discriminator enhances the multilinguality of the embeddings via multilingual adversarial training. Our experimental results based on several language pairs show that our specialized embeddings outperform the state-of-the-art multilingual sentence embedding model on the task of cross-lingual intent classification using only monolingual labeled data.
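
As a rough illustration of how a semantic classifier and a language discriminator can pull an encoder in complementary directions, the sketch below combines a classification loss with an adversarial language term. It is a generic adversarial objective written under stated assumptions, not the loss used in the paper: the function names, the cross-entropy losses, and the `adv_weight` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-likelihood of the correct class index."""
    return -math.log(softmax(logits)[label])


def encoder_objective(semantic_logits, intent_label,
                      language_logits, language_label,
                      adv_weight=0.1):
    """Loss minimized by the embedding encoder in this sketch (assumed form).

    semantic_logits: scores from a hypothetical semantic (intent) classifier.
    language_logits: scores from a hypothetical language discriminator.
    adv_weight:      assumed trade-off between the two terms.
    """
    semantic_loss = cross_entropy(semantic_logits, intent_label)
    discriminator_loss = cross_entropy(language_logits, language_label)
    # The encoder is rewarded when the discriminator cannot identify the
    # language, so the discriminator's loss enters with a negative sign.
    return semantic_loss - adv_weight * discriminator_loss
```

In this sketch, an embedding whose language the discriminator cannot guess yields a lower encoder objective than one whose language is easily identified, which is the intuition behind making embeddings language-agnostic while keeping them semantically separable.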

Published

2020-04-03

How to Cite

Hirota, W., Suhara, Y., Golshan, B., & Tan, W.-C. (2020). Emu: Enhancing Multilingual Sentence Embeddings with Semantic Specialization. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 7935-7943. https://doi.org/10.1609/aaai.v34i05.6301

Section

AAAI Technical Track: Natural Language Processing