A Robust Adversarial Training Approach to Machine Reading Comprehension

Authors

  • Kai Liu Baidu Inc.
  • Xin Liu Xiamen University
  • An Yang Peking University
  • Jing Liu Baidu Inc.
  • Jinsong Su Xiamen University
  • Sujian Li Peking University
  • Qiaoqiao She Baidu Inc.

DOI:

https://doi.org/10.1609/aaai.v34i05.6357

Abstract

Lacking robustness is a serious problem for Machine Reading Comprehension (MRC) models. One of the most promising ways to alleviate this problem is to augment the training dataset with carefully designed adversarial examples. Generally, such examples are created by rules derived from the observed patterns of successful adversarial attacks. Since the types of adversarial examples are innumerable, manually designing and enriching training data cannot defend against all types of adversarial attacks. In this paper, we propose a novel robust adversarial training approach that improves the robustness of MRC models in a more generic way. Given an MRC model well trained on the original dataset, our approach dynamically generates adversarial examples based on the parameters of the current model and further trains the model on the generated examples in an iterative schedule. When applied to state-of-the-art MRC models, including QANet, BERT and ERNIE 2.0, our approach achieves significant and comprehensive improvements on 5 adversarial datasets constructed in different ways, without sacrificing performance on the original SQuAD development set. Moreover, when coupled with another data augmentation strategy, our approach further boosts overall performance on the adversarial datasets and outperforms the state-of-the-art methods.
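The iterative schedule the abstract describes (train, attack the current model's parameters, augment with the successful attacks, retrain) can be illustrated on a toy problem. The sketch below is not the paper's method: a 1-D threshold classifier stands in for the MRC model, a fixed perturbation step stands in for the dynamic example generator, and an oracle sign check stands in for the answer-preserving constraints on MRC perturbations. All names and parameters are illustrative.

```python
def train(examples):
    # Toy stand-in for MRC model training: fit a 1-D threshold classifier
    # (predict 1 iff x >= theta) by placing theta midway between the
    # largest negative and the smallest positive training input.
    pos = min(x for x, y in examples if y == 1)
    neg = max(x for x, y in examples if y == 0)
    return (pos + neg) / 2.0

def predict(theta, x):
    return 1 if x >= theta else 0

def gen_adversarial(theta, examples, eps):
    # Attack the *current* model: step each input toward its decision
    # boundary and keep only perturbations that flip the prediction while
    # leaving the true label (here, the sign of x) intact -- a stand-in
    # for semantics-preserving adversarial MRC examples.
    adv = []
    for x, y in examples:
        x_adv = x - eps if x >= theta else x + eps
        label_preserved = (x_adv > 0) == (x > 0)
        if label_preserved and predict(theta, x_adv) != y:
            adv.append((x_adv, y))
    return adv

def robust_train(examples, eps=1.2, max_rounds=5):
    # Iterative schedule: train, attack the current parameters,
    # augment the data, retrain -- until the attack finds nothing.
    data = list(examples)
    theta = train(data)
    for _ in range(max_rounds):
        adv = gen_adversarial(theta, data, eps)
        if not adv:
            break
        data.extend(adv)
        theta = train(data)
    return theta
```

On skewed data such as `[(-2, 0), (0.2, 1), (1.0, 1), (2.0, 1)]` the initial threshold sits at -0.9, far from the true boundary at 0; one round of adversarial augmentation pulls it to -0.3, so the retrained model resists the very perturbation that fooled the original one.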

Published

2020-04-03

How to Cite

Liu, K., Liu, X., Yang, A., Liu, J., Su, J., Li, S., & She, Q. (2020). A Robust Adversarial Training Approach to Machine Reading Comprehension. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 8392-8400. https://doi.org/10.1609/aaai.v34i05.6357

Section

AAAI Technical Track: Natural Language Processing