MaskGEC: Improving Neural Grammatical Error Correction via Dynamic Masking

  • Zewei Zhao Peking University
  • Houfeng Wang Peking University

Abstract

Grammatical error correction (GEC) is a promising natural language processing (NLP) application, whose goal is to change the sentences with grammatical errors into the correct ones. Neural machine translation (NMT) approaches have been widely applied to this translation-like task. However, such methods need a fairly large parallel corpus of error-annotated sentence pairs, which is not easy to get especially in the field of Chinese grammatical error correction. In this paper, we propose a simple yet effective method to improve the NMT-based GEC models by dynamic masking. By adding random masks to the original source sentences dynamically in the training procedure, more diverse instances of error-corrected sentence pairs are generated to enhance the generalization ability of the grammatical error correction model without additional data. The experiments on NLPCC 2018 Task 2 show that our MaskGEC model improves the performance of the neural GEC models. Besides, our single model for Chinese GEC outperforms the current state-of-the-art ensemble system in NLPCC 2018 Task 2 without any extra knowledge.

Published
2020-04-03
Section
AAAI Technical Track: Applications