Out-of-distribution (OOD) detection is important for deploying machine learning models in the real world, where test data from shifted distributions can naturally arise. While a plethora of algorithmic approaches have recently emerged for OOD detection, a critical gap remains in theoretical understanding. In this work, we develop an analytical framework that characterizes and unifies the theoretical understanding for OOD detection. Our analytical framework motivates a novel OOD detection method for neural networks, GEM, which demonstrates both theoretical and empirical superiority. In particular, on CIFAR-100 as in-distribution data, our method outperforms a competitive baseline by 16.57% (FPR95). Lastly, we formally provide provable guarantees and comprehensive analysis of our method, underpinning how various properties of data distribution affect the performance of OOD detection.