Addressing the Under-Translation Problem from the Entropy Perspective

Yang Zhao; Jiajun Zhang; Chengqing Zong; Zhongjun He; Hua Wu

doi:10.1609/aaai.v33i01.3301451

Authors

Yang Zhao Institute of Automation, Chinese Academy of Sciences
Jiajun Zhang Institute of Automation, Chinese Academy of Sciences
Chengqing Zong Institute of Automation, Chinese Academy of Sciences
Zhongjun He Baidu, Inc.
Hua Wu Baidu, Inc.

DOI:

https://doi.org/10.1609/aaai.v33i01.3301451

Abstract

Neural Machine Translation (NMT) has drawn much attention in recent years due to its promising translation performance. However, under-translation remains a major challenge. In this paper, we focus on the under-translation problem and attempt to find out what kinds of source words are more likely to be ignored. Through analysis, we observe that a source word with a large translation entropy is more likely to be dropped. To address this problem, we propose a coarse-to-fine framework. In the coarse-grained phase, we introduce a simple strategy to reduce the entropy of high-entropy words by constructing pseudo target sentences. In the fine-grained phase, we propose three methods, a pre-training method, a multitask method, and a two-pass method, to encourage the neural model to correctly translate these high-entropy words. Experimental results on various translation tasks show that our method can significantly improve translation quality and substantially reduce the under-translation cases of high-entropy words.
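The abstract does not spell out how translation entropy is computed. As a rough illustration only, the sketch below assumes it is the Shannon entropy of the distribution of target words aligned to a given source word, estimated from alignment counts. The function name `translation_entropy` and the toy alignment lists are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def translation_entropy(aligned_targets):
    """Shannon entropy (in bits) of the empirical distribution p(t | s).

    `aligned_targets` lists every target word observed as a translation
    of one source word s (an assumed input format, e.g. collected from
    word alignments over a parallel corpus).
    """
    counts = Counter(aligned_targets)
    total = sum(counts.values())
    # H(s) = sum_t p(t|s) * log2(1 / p(t|s)), with p(t|s) = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A source word that is always translated the same way has zero entropy.
low = translation_entropy(["cat"] * 8)

# A source word spread evenly over four translations has higher entropy.
high = translation_entropy(["run", "operate", "manage", "execute"] * 2)

print(low, high)
```

Under this assumed definition, a word with many competing translations scores higher than a word with a single dominant translation, which matches the intuition that high-entropy words are harder for the model to commit to.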
