Proceedings:
No. 7: AAAI-22 Technical Tracks 7
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 36
Track:
AAAI Technical Track on Machine Learning II
Downloads:
Abstract:
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks --- small modifications of the input that change the predictions. Besides rigorously studied $ell_p$-bounded additive perturbations, semantic perturbations (e.g. rotation, translation) raise a serious concern on deploying ML systems in real-world. Therefore, it is important to provide provable guarantees for deep learning models against semantically meaningful input transformations. In this paper, we propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds that can be used in general attack settings. We estimate the probability of a model to fail if the attack is sampled from a certain distribution. Our theoretical findings are supported by experimental results on different datasets.
DOI:
10.1609/aaai.v36i7.20768
AAAI
Proceedings of the AAAI Conference on Artificial Intelligence, 36