Heterogeneous Face Recognition (HFR) is a challenging task because of the large discrepancy between modalities and the scarcity of training images in some of them. In this paper, we propose a new two-branch network architecture, termed Residual Compensation Networks (RCN), to learn separate features for different modalities in HFR. The RCN incorporates a residual compensation (RC) module and a modality discrepancy loss (MD loss) into traditional convolutional neural networks. The RC module reduces the modality discrepancy by adding a compensation term to the representation of one modality so that it moves closer to the representation of the other. The MD loss alleviates the modality discrepancy by minimizing the cosine distance between the features of different modalities. In addition, we explore different architectures and positions for the RC module, and evaluate different transfer learning strategies for HFR. Extensive experiments on IIIT-D Viewed Sketch, Forensic Sketch, CASIA NIR-VIS 2.0 and CUHK NIR-VIS show that our RCN significantly outperforms other state-of-the-art methods.