Preserving Ordinal Consensus: Towards Feature Selection for Unlabeled Data

  • Jun Guo Tsinghua University
  • Heng Chang Tsinghua University
  • Wenwu Zhu Tsinghua University

Abstract

To better pre-process unlabeled data, most existing feature selection methods remove redundant and noisy information by exploring some intrinsic structures embedded in samples. However, these unsupervised studies focus too much on the relations among samples, totally neglecting the feature-level geometric information. This paper proposes an unsupervised triplet-induced graph to explore a new type of potential structure at feature level, and incorporates it into simultaneous feature selection and clustering. In the feature selection part, we design an ordinal consensus preserving term based on a triplet-induced graph. This term enforces the projection vectors to preserve the relative proximity of original features, which contributes to selecting more relevant features. In the clustering part, Self-Paced Learning (SPL) is introduced to gradually learn from ‘easy’ to ‘complex’ samples. SPL alleviates the dilemma of falling into the bad local minima incurred by noise and outliers. Specifically, we propose a compelling regularizer for SPL to obtain a robust loss. Finally, an alternating minimization algorithm is developed to efficiently optimize the proposed model. Extensive experiments on different benchmark datasets consistently demonstrate the superiority of our proposed method.

Published
2020-04-03
Section
AAAI Technical Track: AI and the Web