Swaminathan P, Balaraman Ravindran
Co-clustering exploits co-occurrence information from contingency tables to cluster rows and columns simultaneously. It has been established that co-clustering produces a better clustering structure than conventional clustering methods. So far, co-clustering has been used only as a technique for producing hard clusters, which might be inadequate for applications such as document clustering. In this paper, we present an algorithm that uses the information-theoretic approach [1] to generate overlapping (soft) clusters. The algorithm maintains a membership probability for every instance with respect to each of the possible clusters and iteratively tunes these membership values. The theoretical formulation of the criterion function is presented first, followed by the actual algorithm. We evaluate the algorithm on document/word co-occurrence information and present experimental results.
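The idea of soft memberships over a contingency table can be illustrated with a small sketch. This is not the algorithm from the paper: the abstract does not state the criterion function, so the sketch assumes the loss in mutual information that is commonly used in information-theoretic co-clustering, and it only evaluates that loss for fixed memberships rather than tuning them. The toy counts, the membership values, and all function names are hypothetical.

```python
import math

def normalize(counts):
    """Turn a contingency table of counts into a joint distribution p(x, y)."""
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]

def mutual_information(joint):
    """Mutual information (in nats) of a joint distribution given as a list of rows."""
    row_marg = [sum(row) for row in joint]
    col_marg = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (row_marg[i] * col_marg[j]))
    return mi

def soft_cocluster_joint(joint, row_memb, col_memb):
    """Joint over (row cluster, column cluster) under soft memberships.

    row_memb[x][a] is the assumed probability that row x belongs to row
    cluster a; col_memb[y][b] is the same for column y and column cluster b.
    """
    k, l = len(row_memb[0]), len(col_memb[0])
    out = [[0.0] * l for _ in range(k)]
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            for a in range(k):
                for b in range(l):
                    out[a][b] += p * row_memb[x][a] * col_memb[y][b]
    return out

# Hypothetical document/word counts with two visible blocks.
counts = [[5, 5, 0, 0],
          [4, 6, 0, 0],
          [0, 0, 6, 4],
          [0, 0, 5, 5]]
joint = normalize(counts)

# Hypothetical soft memberships: each row of each matrix sums to 1.
row_memb = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
col_memb = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]

clustered = soft_cocluster_joint(joint, row_memb, col_memb)
loss = mutual_information(joint) - mutual_information(clustered)

# Hard clustering is the special case where every membership is 0 or 1.
hard_rows = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
hard_cols = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
hard_loss = mutual_information(joint) - mutual_information(
    soft_cocluster_joint(joint, hard_rows, hard_cols))
```

Under this assumed objective, an iterative procedure would adjust `row_memb` and `col_memb` to reduce `loss`; how the paper actually updates the memberships is not described in the abstract.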
Subjects: 12. Machine Learning and Discovery; 13. Natural Language Processing
Submitted: Feb 20, 2008