AAAI Publications, Thirty-First AAAI Conference on Artificial Intelligence

Font Size: 
Simultaneous Clustering and Ensemble
Zhiqiang Tao, Hongfu Liu, Yun Fu

Last modified: 2017-02-12


Ensemble Clustering (EC) has gained a great deal of attention throughout the fields of data mining and machine learning, since it emerged as an effective and robust clustering framework. Typically, EC methods try to fuse multiple basic partitions (BPs) into a consensus one, of which each BP is obtained by performing traditional clustering method on the same dataset. One promising direction for ensemble clustering is to derive pairwise similarity from BPs, and then transform it as a graph partition problem. However, these graph based methods may suffer from an information loss when computing the similarity between data points, because they only utilize the categorical data provided by multiple BPs, yet neglect rich information from raw features. This problem can badly undermine the underlying cluster structure in the original feature space, and thus degrade the clustering performance. In light of this, we propose a novel Simultaneous Clustering and Ensemble (SCE) framework to alleviate such detrimental effect, which employs the similarity matrix from raw features to enhance the co-association matrix summarized by multiple BPs. Two neat closed-form solutions given by eigenvalue decomposition are provided for SCE. Experiments conducted on 16 real-world datasets demonstrate the effectiveness of the proposed SCE over the traditional clustering and state-of-the-art ensemble clustering methods. Moreover, several impact factors that may affect our method are also explored extensively.


Ensemble Clustering; Co-association Matrix; Spectral Clustering

Full Text: PDF