Proceedings: Thirty-First AAAI Conference On Artificial Intelligence
Volume: Proceedings of the AAAI Conference on Artificial Intelligence, 31
Issue: No. 1
Track: AAAI Technical Track: Heuristic Search and Optimization
Abstract:
Asynchronous parallel algorithms have recently received much attention in the optimization field due to the demands of modern large-scale optimization problems. However, most asynchronous algorithms focus on convex problems; analysis of nonconvex problems is lacking. For the Asynchronous Stochastic Gradient Descent (ASGD) algorithm, the best known result (Lian et al., 2015) achieves only an asymptotic $O(\frac{1}{\epsilon^2})$ rate of convergence to stationary points on nonconvex problems. In this paper, we study Stochastic Variance Reduced Gradient (SVRG) in the asynchronous setting. We propose the Asynchronous Stochastic Variance Reduced Gradient (ASVRG) algorithm for nonconvex finite-sum problems. We develop two schemes for ASVRG, depending on whether the parameters are updated atomically or not. We prove that both schemes achieve linear speedup (a non-asymptotic $O(\frac{n^{2/3}}{\epsilon})$ rate of convergence to stationary points) for nonconvex problems when the delay parameter satisfies $\tau \leq n^{1/3}$, where $n$ is the number of training samples. We also establish a non-asymptotic $O(\frac{n^{2/3}\tau^{1/3}}{\epsilon})$ rate of convergence to stationary points for our algorithm without any assumption on $\tau$. This further demonstrates that, even with asynchronous updating, SVRG requires fewer Incremental First-order Oracle (IFO) calls than Stochastic Gradient Descent and Gradient Descent. We also run experiments on a shared-memory multi-core system to demonstrate the efficiency of our algorithm.
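As a rough illustration of the variance-reduction idea the abstract builds on (not the paper's asynchronous ASVRG schemes, which parallelize the inner loop across workers with read/write delay up to $\tau$ and with or without atomic updates), here is a minimal serial SVRG sketch for a finite-sum objective. All names (svrg, grad_i, the inner-loop length m, the step size eta) are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

# Minimal serial SVRG sketch for a finite-sum objective
#   f(x) = (1/n) * sum_i f_i(x)
# grad_i(x, i) returns the gradient of the i-th component f_i at x.
# This shows only the variance-reduced gradient estimator; the paper's
# asynchronous (ASVRG) update schemes are not reproduced here.
def svrg(grad_i, x0, n, n_epochs=10, m=None, eta=0.01, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m            # inner-loop length (assumed choice)
    x_tilde = np.array(x0, dtype=float)  # snapshot point
    for _ in range(n_epochs):
        # Full gradient at the snapshot point x_tilde
        full_grad = sum(grad_i(x_tilde, i) for i in range(n)) / n
        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient estimator
            v = grad_i(x, i) - grad_i(x_tilde, i) + full_grad
            x -= eta * v
        x_tilde = x                      # last iterate becomes the new snapshot
    return x_tilde

# Toy usage example with least-squares components f_i(x) = 0.5 * (a_i^T x - b_i)^2
# (chosen only for brevity; the paper targets nonconvex finite-sum problems).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
    grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]
    x_hat = svrg(grad_i, np.zeros(5), n=100, n_epochs=20, eta=0.05, rng=rng)
    print(x_hat)
```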
DOI: 10.1609/aaai.v31i1.10651