Multi-Level Head-Wise Match and Aggregation in Transformer for Textual Sequence Matching

Authors

  • Shuohang Wang, Singapore Management University
  • Yunshi Lan, Singapore Management University
  • Yi Tay, Nanyang Technological University
  • Jing Jiang, Singapore Management University
  • Jingjing Liu, Microsoft Dynamics 365 AI Research

DOI:

https://doi.org/10.1609/aaai.v34i05.6458

Abstract

The Transformer has been successfully applied to many natural language processing tasks. However, for textual sequence matching, simple matching between the representations of a pair of sequences may introduce unnecessary noise. In this paper, we propose a new approach to sequence-pair matching with the Transformer, which learns head-wise matching representations on multiple levels. Experiments show that our proposed approach achieves new state-of-the-art performance on multiple tasks that rely only on pre-computed sequence vector representations, such as SNLI, MNLI-match, MNLI-mismatch, QQP, and SQuAD-binary.
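To make the core idea concrete, below is a minimal PyTorch sketch of one plausible head-wise match-and-aggregate step. The module name HeadWiseMatcher, the matching features [u; v; |u - v|; u * v], and the attention-style pooling over heads are illustrative assumptions for exposition, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class HeadWiseMatcher(nn.Module):
        """Sketch of head-wise matching between two sequence representations.

        Assumptions (hypothetical, for illustration): `u` and `v` are pooled
        sequence vectors of size `d_model`, split into `num_heads` sub-vectors
        ("heads"); the heuristic features [u; v; |u - v|; u * v] are a common
        matching choice, not necessarily the paper's exact formulation.
        """

        def __init__(self, d_model: int, num_heads: int):
            super().__init__()
            assert d_model % num_heads == 0
            self.num_heads = num_heads
            self.head_dim = d_model // num_heads
            # Per-head matching features are projected back to head_dim,
            # then the heads are aggregated by attention-style pooling.
            self.proj = nn.Linear(4 * self.head_dim, self.head_dim)
            self.score = nn.Linear(self.head_dim, 1)

        def forward(self, u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
            # u, v: (batch, d_model) -> (batch, num_heads, head_dim)
            b = u.size(0)
            u = u.view(b, self.num_heads, self.head_dim)
            v = v.view(b, self.num_heads, self.head_dim)
            # Head-wise matching features for each head separately.
            m = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
            m = torch.tanh(self.proj(m))             # (batch, heads, head_dim)
            # Aggregate heads with learned attention weights.
            w = torch.softmax(self.score(m), dim=1)  # (batch, heads, 1)
            return (w * m).sum(dim=1)                # (batch, head_dim)

Applying such a matcher to the outputs of several Transformer layers and combining the per-level results (for example, by concatenation before a classifier) would give the multi-level aggregation the abstract describes.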

Published

2020-04-03

How to Cite

Wang, S., Lan, Y., Tay, Y., Jiang, J., & Liu, J. (2020). Multi-Level Head-Wise Match and Aggregation in Transformer for Textual Sequence Matching. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 9209-9216. https://doi.org/10.1609/aaai.v34i05.6458

Section

AAAI Technical Track: Natural Language Processing