AAAI Publications, Thirty-First AAAI Conference on Artificial Intelligence

Font Size: 
Informative Subspace Learning for Counterfactual Inference
Yale Chang, Jennifer G. Dy

Last modified: 2017-02-13


Inferring causal relations from observational data is widely used for knowledge discovery in healthcare and economics. To investigate whether a treatment can affect an outcome of interest, we focus on answering counterfactual questions of this type: what would a patient’s blood pressure be had he/she received a different treatment? Nearest neighbor matching (NNM) sets the counterfactual outcome of any treatment (control) sample to be equal to the factual outcome of its nearest neighbor in the control (treatment) group. Although being simple, flexible and interpretable, most NNM approaches could be easily misled by variables that do not affect the outcome. In this paper, we address this challenge by learning subspaces that are predictive of the outcome variable for both the treatment group and control group. Applying NNM in the learned subspaces leads to more accurate estimation of the counterfactual outcomes and therefore treatment effects. We introduce an informative subspace learning algorithm by maximizing the nonlinear dependence between the candidate subspace and the outcome variable measured by the Hilbert-Schmidt Independence Criterion (HSIC). We propose a scalable estimator of HSIC, called HSIC-RFF that reduces the quadratic computational and storage complexities (with respect to the sample size) of the naive HSIC implementation to linear through constructing random Fourier features. We also prove an upper bound on the approximation error of the HSIC-RFF estimator. Experimental results on simulated datasets and real-world datasets demonstrate our proposed approach outperforms existing NNM approaches and other commonly used regression-based methods for counterfactual inference.


Counterfactual; Causality; Subspace Learning

Full Text: PDF