Abstract:
Since amounts of unlabelled and high-dimensional data needed to be processed, unsupervised feature selection has become an important and challenging problem in machine learning. Conventional embedded unsupervised methods always need to construct the similarity matrix, which makes the selected features highly depend on the learned structure. However real world data always contain lots of noise samples and features that make the similarity matrix obtained by original data can't be fully relied. We propose an unsupervised feature selection approach which performs feature selection and local structure learning simultaneously, the similarity matrix thus can be determined adaptively. Moreover, we constrain the similarity matrix to make it contain more accurate information of data structure, thus the proposed approach can select more valuable features. An efficient and simple algorithm is derived to optimize the problem. Experiments on various benchmark data sets, including handwritten digit data, face image data and biomedical data, validate the effectiveness of the proposed approach.
DOI:
10.1609/aaai.v30i1.10168