In this paper, we focus on delivering reliable learning results for high-stakes applications such as self-driving, financial investment, and clinical diagnosis, where the accuracy of predictions is a more crucial requirement than producing predictions for all query samples. We adopt the learning with reject option framework, in which the learning model predicts only on those samples for which it is confident of giving the correct answer. However, for most prevailing deep learning predictors, the confidence estimated by the models themselves is far from reflecting their real generalization performance. To model the reliability of predictions concisely, we propose an exploratory solution called GALVE (Generative Adversarial Learning with Variance Expansion), which adopts generative adversarial learning to implicitly measure the region where the model achieves good generalization performance. By applying GALVE to measure the reliability of predictions, we achieve an error rate less than half of that obtained when reliability is measured directly by model confidence on the CIFAR10 and SVHN computer vision tasks.