Structured Sparsification of Gated Recurrent Neural Networks

  • Ekaterina Lobacheva National Research University Higher School of Economics
  • Nadezhda Chirkova National Research University Higher School of Economics
  • Alexander Markovich National Research University Higher School of Economics
  • Dmitry Vetrov National Research University Higher School of Economics

Abstract

One of the most popular approaches to neural network compression is sparsification — learning sparse weight matrices. In structured sparsification, weights are set to zero in groups corresponding to structural units, e.g., neurons. We further develop the structured sparsification approach for gated recurrent neural networks, e.g., Long Short-Term Memory (LSTM). Specifically, in addition to sparsifying individual weights and neurons, we propose sparsifying the preactivations of the gates. This makes some gates constant and simplifies the LSTM structure. We test our approach on text classification and language modeling tasks. Our method improves the neuron-wise compression of the model on most of the tasks. We also observe that the resulting structure of gate sparsity depends on the task, and we connect the learned structures to the specifics of the particular tasks.
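The group sparsification idea in the abstract can be illustrated with a group-lasso proximal step: weights feeding one gate preactivation of one neuron form a group, and when the group's norm falls below the regularization strength the whole group is zeroed, leaving only the bias — a constant gate. This is a minimal sketch with hypothetical weight values, not the authors' exact training procedure:

```python
import numpy as np

def group_prox(W, lam):
    """Proximal operator of the group-lasso penalty lam * sum_g ||W_g||_2,
    where each row of W is one group (e.g., the weights producing one
    gate's preactivation for one neuron). Rows with L2 norm <= lam are
    zeroed exactly; the rest are shrunk toward zero."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return W * scale

# Hypothetical gate weight rows: two weak rows and one strong row.
W = np.array([
    [0.10, -0.10, 0.05, 0.00],   # weak row -> zeroed, gate becomes constant
    [2.00, -1.50, 0.80, 1.20],   # strong row -> survives (shrunk)
    [0.05,  0.10, -0.08, 0.02],  # weak row -> zeroed
])
W_sparse = group_prox(W, lam=0.5)

# Rows 0 and 2 are exactly zero: those gate preactivations reduce to
# their biases, so the corresponding gates are constant.
constant_gates = np.where(~W_sparse.any(axis=1))[0]
```

Applying such a step during training drives entire groups to exact zeros, which is what allows removing structural units (neurons, or gate computations) rather than scattered individual weights.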

Published
2020-04-03
Section
AAAI Technical Track: Machine Learning