AAAI Publications, Workshops at the Twenty-Fourth AAAI Conference on Artificial Intelligence

Font Size: 
Learning to Extract Quality Discourse in Online Communities
Michael Robert Brennan, Stacy Wrazien, Rachel Greenstadt

Last modified: 2010-07-07

Abstract


Collaborative filtering systems have been developed to manage information overload and improve discourse in online communities. In such systems, users rank content provided by other users on the validity or usefulness within their particular context. The goal is that "good" content will rise to prominence and "bad" content will fade into obscurity. These filtering mechanisms are not well-understood and have known weaknesses. For example, they depend on the presence of a large crowd to rate content, but such a crowd may not be present. Additionally, the community's decisions determine which voices will reach a large audience and which will be silenced, but it is not known if these decisions represent "the wisdom of crowds" or a "censoring mob." Our approach uses statistical machine learning to predict community ratings. By extracting features that replicate the community's verdict, we can better understand collaborative filtering, improve the way the community uses the ratings of their members, and design agents that augment community decision-making. Slashdot is an example of such a community where peers will rate each others' comments based on their relevance to the post. This work extracts a wide variety of features from the Slashdot metadata and posts' linguistic contents to identify features that can predict the community rating. We find that author reputation, use of pronouns, and author sentiment are salient. We achieve 76% accuracy predicting community ratings as good, neutral, or bad.

Full Text: PDF