Robust Gradient-Based Markov Subsampling

Authors

  • Tieliang Gong, University of Ottawa
  • Quanhan Xi, University of Ottawa
  • Chen Xu, University of Ottawa

DOI:

https://doi.org/10.1609/aaai.v34i04.5817

Abstract

Subsampling is a widely used and effective method for dealing with the challenges posed by big data. Most subsampling procedures are designed within the importance sampling framework, where samples with high importance measures are assigned correspondingly high sampling probabilities. In highly noisy settings, however, these samples may produce an unstable estimator and lead to misleading results. To tackle this issue, we propose a gradient-based Markov subsampling (GMS) algorithm to achieve robust estimation. The core idea is to construct a subset that allows us to conservatively correct a crude initial estimate towards the true signal. Specifically, GMS selects samples with small gradients via a probabilistic procedure, constructing a subset that is likely to exclude noisy samples and provide a safe improvement over the initial estimate. We show that the GMS estimator is statistically consistent at a rate that matches the optimal rate in the minimax sense. The promising performance of GMS is supported by simulation studies and real-data examples.
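The full algorithm and its analysis are given in the paper itself; the abstract alone does not specify the chain construction. As a loose illustration of the idea it describes (a crude initial fit, followed by a Markov-chain selection that favours samples with small gradients, followed by a refit on the selected subset), the following Python sketch uses least-squares regression with a simple Metropolis-style acceptance rule. All concrete choices here (the squared loss, the per-sample gradient norm, the temperature `t`, the subset sizes) are illustrative assumptions, not the authors' specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data with a fraction of heavy outliers
# (illustrative setup only; not the paper's experimental design).
n, d = 2000, 5
X = rng.standard_normal((n, d))
beta_true = rng.standard_normal(d)
y = X @ beta_true + 0.1 * rng.standard_normal(n)
outliers = rng.choice(n, size=n // 10, replace=False)
y[outliers] += 5.0 * rng.standard_normal(len(outliers))

# Step 1: crude initial estimate from a small uniform subsample.
idx0 = rng.choice(n, size=200, replace=False)
beta0, *_ = np.linalg.lstsq(X[idx0], y[idx0], rcond=None)

def grad_norm(i, beta):
    # Norm of the per-sample squared-loss gradient, ||(x_i^T beta - y_i) x_i||.
    # Outliers tend to have large residuals, hence large gradients.
    r = X[i] @ beta - y[i]
    return abs(r) * np.linalg.norm(X[i])

def markov_subsample(beta, m, t=1.0):
    # Metropolis-style chain over sample indices that favours SMALL gradients:
    # a proposed index j is accepted with probability min(1, exp((g_i - g_j)/t)),
    # so low-gradient (likely clean) points are visited more often.
    # The temperature t is a hypothetical tuning knob, not from the paper.
    selected = []
    i = rng.integers(n)
    g_i = grad_norm(i, beta)
    while len(selected) < m:
        j = rng.integers(n)
        g_j = grad_norm(j, beta)
        if g_j <= g_i or rng.random() < np.exp((g_i - g_j) / t):
            i, g_i = j, g_j
        selected.append(i)
    return np.unique(selected)

# Step 2: build the subset, then Step 3: refit on it to correct beta0.
sub = markov_subsample(beta0, m=400)
beta_gms, *_ = np.linalg.lstsq(X[sub], y[sub], rcond=None)
```

Because the chain spends most of its time on small-gradient samples, the refit subset tends to exclude the injected outliers, so the final estimate corrects the noisy initial fit rather than amplifying it.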

Published

2020-04-03

How to Cite

Gong, T., Xi, Q., & Xu, C. (2020). Robust Gradient-Based Markov Subsampling. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04), 4004-4011. https://doi.org/10.1609/aaai.v34i04.5817

Section

AAAI Technical Track: Machine Learning