DeepCCFV: Camera Constraint-Free Multi-View Convolutional Neural Network for 3D Object Retrieval

  • Zhengyue Huang Tsinghua University
  • Zhehui Zhao Tsinghua University
  • Hengguang Zhou University of Toronto
  • Xibin Zhao Tsinghua University
  • Yue Gao Tsinghua University

Abstract

3D object retrieval has a compelling demand in the field of computer vision with the rapid development of 3D vision technology and increasing applications of 3D objects. 3D objects can be described in different ways such as voxel, point cloud, and multi-view. Among them, multi-view based approaches proposed in recent years show promising results. Most of them require a fixed predefined camera position setting which provides a complete and uniform sampling of views for objects in the training stage. However, this causes heavy over-fitting problems which make the models failed to generalize well in free camera setting applications, particularly when insufficient views are provided. Experiments show the performance drastically drops when the number of views reduces, hindering these methods from practical applications. In this paper, we investigate the over-fitting issue and remove the constraint of the camera setting. First, two basic feature augmentation strategies Dropout and Dropview are introduced to solve the over-fitting issue, and a more precise and more efficient method named DropMax is proposed after analyzing the drawback of the basic ones. Then, by reducing the over-fitting issue, a camera constraint-free multi-view convolutional neural network named DeepCCFV is constructed. Extensive experiments on both single-modal and cross-modal cases demonstrate the effectiveness of the proposed method in free camera settings comparing with existing state-of-theart 3D object retrieval methods.

Published
2019-07-17