AAAI Publications, Thirty-First AAAI Conference on Artificial Intelligence

Learning Invariant Deep Representation for NIR-VIS Face Recognition
Ran He, Xiang Wu, Zhenan Sun, Tieniu Tan

Last modified: 2017-02-13


Visual versus near infrared (VIS-NIR) face recognition is still a challenging heterogeneous task due to the large appearance difference between the VIS and NIR modalities. This paper presents a deep convolutional network approach that uses only one network to map both NIR and VIS images into a compact Euclidean space. The low-level layers of this network are trained only on large-scale VIS data. Each convolutional layer is implemented with the simplest case of the maxout operator. The high-level layer is divided into two orthogonal subspaces that contain modality-invariant identity information and modality-variant spectrum information, respectively. Our joint formulation leads to an alternating minimization approach for learning the deep representation at training time and an efficient computation for heterogeneous data at test time. Experimental evaluations show that our method achieves a 94% verification rate at FAR=0.1% on the challenging CASIA NIR-VIS 2.0 face recognition dataset. Compared with state-of-the-art methods, it reduces the error rate by 58% with only a compact 64-D representation.
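As a loose illustration of the "simplest case of the maxout operator" mentioned in the abstract, the idea is to compute two linear responses and keep the element-wise maximum. The sketch below uses plain Python lists and a hypothetical function name of my own; the paper's actual layers are convolutional feature maps, not vectors:

```python
def maxout_pair(response_a, response_b):
    """Simplest maxout (k = 2): element-wise max of two linear responses.

    In the network described above, each convolutional layer would produce
    two feature maps and keep the larger activation at every position;
    here the two responses are plain lists of floats for illustration.
    """
    return [max(a, b) for a, b in zip(response_a, response_b)]


# Toy usage: two competing responses for a 4-element feature vector.
print(maxout_pair([0.2, -1.0, 3.5, 0.0], [1.1, -0.5, 2.0, -0.3]))
# → [1.1, -0.5, 3.5, 0.0]
```

Because the max is taken over learned linear pieces, this acts as a learnable piecewise-linear activation and halves the channel count, which is consistent with the compact representation the abstract emphasizes.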


deep learning; face recognition; heterogeneous; near infrared; CNN
