Alignment-Enhanced Transformer for Constraining NMT with Pre-Specified Translations

  • Kai Song, Alibaba Group
  • Kun Wang, Soochow University
  • Heng Yu, Alibaba Group
  • Yue Zhang, Westlake University
  • Zhongqiang Huang, Alibaba Group
  • Weihua Luo, Alibaba Group
  • Xiangyu Duan, Soochow University
  • Min Zhang, Soochow University

Abstract

We investigate the task of constraining NMT with pre-specified translations, which has practical significance for a number of research and industrial applications. Existing work imposes pre-specified translations as lexical constraints during decoding, relying on word alignments derived from target-to-source attention weights. However, multiple recent studies have found that word alignment derived from generic attention heads in the Transformer is unreliable. We address this problem by introducing a dedicated head in the multi-head Transformer architecture to capture external supervision signals. Results on five language pairs show that our method is highly effective in constraining NMT with pre-specified translations, consistently outperforming previous methods in translation quality.
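To make the core idea concrete, below is a minimal, hedged sketch (not the authors' released code) of what a dedicated, alignment-supervised head in target-to-source multi-head attention could look like: one head's attention distribution is additionally trained against externally supplied word alignments, so that head can later be used to decide where to place pre-specified translations during decoding. The class name `AlignmentHeadAttention`, the choice of head index 0, the tensor shapes, and the soft cross-entropy form of the alignment loss are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: multi-head target-to-source attention with one
# alignment-supervised head. Written against PyTorch; all names,
# shapes, and the loss form are assumptions for illustration only.
import torch
import torch.nn as nn


class AlignmentHeadAttention(nn.Module):
    """Cross-attention whose head 0 is supervised with external word alignments."""

    def __init__(self, d_model: int = 512, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, tgt, src_enc, align_gold=None):
        # tgt: (batch, tgt_len, d_model); src_enc: (batch, src_len, d_model)
        # align_gold: (batch, tgt_len, src_len) soft/one-hot alignment matrix, or None
        b, t, _ = tgt.shape
        s = src_enc.size(1)
        q = self.q_proj(tgt).view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        k = self.k_proj(src_enc).view(b, s, self.num_heads, self.d_head).transpose(1, 2)
        v = self.v_proj(src_enc).view(b, s, self.num_heads, self.d_head).transpose(1, 2)

        # Standard scaled dot-product attention over all heads.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5   # (b, heads, t, s)
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        out = self.out_proj(out)

        align_loss = None
        if align_gold is not None:
            # Supervise only head 0: push its attention distribution toward the
            # external alignments (cross-entropy against soft targets).
            align_attn = attn[:, 0]                              # (b, t, s)
            align_loss = -(align_gold * (align_attn + 1e-9).log()).sum(-1).mean()

        # attn[:, 0] can be read off at decoding time to decide when a source
        # phrase with a pre-specified translation is being translated.
        return out, attn[:, 0], align_loss
```

In a full system this alignment loss would be added, with some weight, to the usual translation cross-entropy during training, while the remaining heads stay unsupervised; the design choice of reserving a single head keeps the alignment signal from interfering with the rest of the attention mechanism.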

Published
2020-04-03
Section
AAAI Technical Track: Natural Language Processing