Finding Meaning of Clusters

Yutaka Matsuo and Yukio Ohsawa

Clustering is an important data exploration task in chance discovery as well as in data mining. The first hierarchical clustering dates back to 1951 by K. Florek; since then, there have been numerous algorithms. However, there is no consensus among researchers as to what constitutes a cluster; the choice of the cluster is application-dependent. Although clustering is sometimes evaluated by interpretability of clusters, few studies have been done to reveal the interpretation aspect of clusters. This paper explains development of a new clustering algorithm by graph-based partitioning which aims to simplify interpretation of clusters. Two typical cluster types are considered: a star and a diamond. A star is a cluster with explicit shared context, represented by a central node. A diamond is a cluster with shared context, whose main cause of the context is implicit and hidden. These two types are very easy to understand. We elicit these types of clusters from a given weighted linkage graph. Minimization of weight of the graph cut is also considered. We show some examples and explain the effectiveness of our method.

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.