Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics

  • Bingzhe Wu Peking University
  • Chaochao Chen Ant Financial
  • Shiwan Zhao IBM Research
  • Cen Chen Ant Financial Services Group
  • Yuan Yao Hong Kong University of Science and Technology
  • Guangyu Sun Peking University
  • Li Wang Ant Financial
  • Xiaolu Zhang Ant Financial Services Group
  • Jun Zhou Ant Financial

Abstract

Bayesian deep learning has recently been regarded as an intrinsic way to characterize the weight uncertainty of deep neural networks (DNNs). Stochastic Gradient Langevin Dynamics (SGLD) is an effective method for enabling Bayesian deep learning on large-scale datasets. Previous theoretical studies have established various appealing properties of SGLD, ranging from convergence properties to generalization bounds. In this paper, we study SGLD from the novel perspective of membership privacy protection (i.e., preventing membership attacks). A membership attack, which aims to determine whether a specific sample was used to train a given DNN model, has emerged as a common threat against deep learning algorithms. To this end, we build a theoretical framework for analyzing the information leakage (w.r.t. the training dataset) of a model trained using SGLD. Based on this framework, we demonstrate that SGLD can prevent information leakage from the training dataset to a certain extent. Moreover, our theoretical analysis extends naturally to other types of Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods. Empirical results on different datasets and models verify our theoretical findings and suggest that SGLD can not only reduce information leakage but also improve the generalization ability of DNN models in real-world applications.
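
For readers unfamiliar with SGLD, the following is a minimal sketch of the standard update rule: a half step along a minibatch estimate of the log-posterior gradient, plus Gaussian noise whose variance equals the step size. The toy model (a Gaussian mean with unit observation variance and a Gaussian prior), the function name `sgld_sample`, and all hyperparameters are assumptions chosen for illustration. They are not taken from the paper, and the sketch does not implement the paper's privacy analysis or any membership attack.

```python
import numpy as np

def sgld_sample(data, n_steps=2000, batch_size=50, step_size=1e-4,
                prior_std=10.0, seed=0):
    """Draw SGLD iterates for the mean of a unit-variance Gaussian.

    Illustrative toy model (an assumption, not the paper's setup):
    x_i ~ N(theta, 1), prior theta ~ N(0, prior_std**2).
    """
    rng = np.random.default_rng(seed)
    n_total = len(data)
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Minibatch drawn without replacement from the training set.
        batch = rng.choice(data, size=batch_size, replace=False)
        # Gradient of the log-prior at theta.
        grad_prior = -theta / prior_std**2
        # Minibatch log-likelihood gradient, rescaled to the full dataset.
        grad_lik = (n_total / batch_size) * np.sum(batch - theta)
        # Injected Gaussian noise with variance equal to the step size.
        noise = rng.normal(0.0, np.sqrt(step_size))
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples[t] = theta
    return samples

# Synthetic data for the toy model.
data = np.random.default_rng(1).normal(2.0, 1.0, size=1000)
samples = sgld_sample(data)

# Closed-form posterior mean for this conjugate model, for comparison.
posterior_mean = data.sum() / (len(data) + 1.0 / 10.0**2)
```

After discarding an initial burn-in (for example the first 500 iterates), the average of `samples` can be compared against `posterior_mean`. The injected noise is what distinguishes SGLD from plain stochastic gradient descent, and it is this randomness in the learned weights that a privacy analysis of the kind described in the abstract would reason about.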

Published
2020-04-03
Section
AAAI Technical Track: Machine Learning