Community detection has been extensively studied for various applications, focusing primarily on network topologies. Recent research has started to explore node contents to identify semantically meaningful communities and interpret their structures using selected words. However, links in real networks typically have semantic descriptions, e.g., comments and emails in social media, supporting the notion of communities of links. Indeed, communities of links can better describe multiple roles that nodes may play and provide a richer characterization of community behaviors than communities of nodes. The second issue in community finding is that most existing methods assume network topologies and descriptive contents to be consistent and to carry the compatible information of node group membership, which is generally violated in real networks. These methods are also restricted to interpret one community with one topic. The third problem is that the existing methods have used top ranked words or phrases to label topics when interpreting communities. However, it is often difficult to comprehend the derived topics using words or phrases, which may be irrelevant. To address these issues altogether, we propose a new unified probabilistic model that can be learned by a dual nested expectation-maximization algorithm. Our new method explores the intrinsic correlation between communities and topics to discover link communities robustly and extract adequate community summaries in sentences instead of words for topic labeling at the same time. It is able to derive more than one topical summary per community to provide rich explanations. We present experimental results to show the effectiveness of our new approach, and evaluate the quality of the results by a case study.
Published Date: 2018-02-08
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2018, Association for the Advancement of Artificial Intelligence All Rights Reserved.