Multi-label feature selection has received considerable attention during the past decade. However, existing algorithms do not attempt to uncover the underlying causal mechanism, and they handle different types of variable relationships separately, ignoring the mutual effects between them. Furthermore, these algorithms lack interpretability: they can only select features for all labels at once, and cannot explain the correlation between a selected feature and a particular label. To address these problems, in this paper we theoretically study the causal relationships in multi-label data and propose a novel Markov blanket based multi-label causal feature selection (MB-MCF) algorithm. MB-MCF first mines the causal mechanism of labels and features to obtain a complete representation of the information about labels. Based on these causal relationships, MB-MCF then selects predictive features and simultaneously distinguishes common features shared by multiple labels from label-specific features owned by single labels. Experiments on real-world data sets validate that MB-MCF can automatically determine the number of selected features while achieving the best performance compared with state-of-the-art methods. An experiment on the Emotions data set further demonstrates the interpretability of MB-MCF.
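The distinction between common and label-specific features can be illustrated with a minimal sketch. It is not the MB-MCF algorithm itself: it assumes the feature part of each label's Markov blanket has already been discovered by some other procedure, and only shows how such blankets could be split into shared and label-specific features. The function name `split_common_and_specific`, the label names, and the feature names are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def split_common_and_specific(blankets):
    """Split per-label feature sets into common and label-specific features.

    `blankets` maps each label to the set of features assumed to lie in
    that label's Markov blanket (hypothetical input; discovering the
    blankets is not shown here).
    """
    # Count in how many labels' blankets each feature appears.
    counts = Counter(f for feats in blankets.values() for f in set(feats))
    # A feature is "common" if at least two labels share it.
    common = {f for f, c in counts.items() if c >= 2}
    # Whatever remains in a label's blanket belongs to that label alone.
    specific = {label: set(feats) - common for label, feats in blankets.items()}
    return common, specific


# Toy, made-up blankets for three labels (not taken from any data set).
blankets = {
    "label_a": {"f1", "f2", "f3"},
    "label_b": {"f2", "f4"},
    "label_c": {"f3", "f5"},
}

common, specific = split_common_and_specific(blankets)
# The selected feature set is the union of common and label-specific features,
# so its size follows from the blankets rather than from a preset number.
selected = common.union(*specific.values())
```

In this toy input, `f2` and `f3` are shared by two labels each, while `f1`, `f4`, and `f5` each belong to a single label. Because every selected feature is tied to the labels whose blanket contains it, this kind of split lets a reader trace which label a given feature is associated with.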
Published Date: 2020-06-02
Registration: ISSN 2374-3468 (Online); ISSN 2159-5399 (Print); ISBN 978-1-57735-835-0 (10 issue set)
Copyright: Published by AAAI Press, Palo Alto, California, USA. Copyright © 2020, Association for the Advancement of Artificial Intelligence. All Rights Reserved.