Good generalization capability is an important quality of well-trained and robust neural networks. However, networks usually struggle when faced with samples outside the training distribution. Mixup is a technique that improves generalization, reduces memorization, and increases adversarial robustness. We apply a variant of Mixup called Manifold Mixup to the sentence classification problem, and present the results along with an ablation study. Our methodology outperforms CNN, LSTM, and vanilla BERT models in generalization.
Published Date: 2020-06-02
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print) ISBN 978-1-57735-835-0 (10 issue set)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2020, Association for the Advancement of Artificial Intelligence All Rights Reserved