Abstract:
This paper introduces a simple method for estimating cultural orientation: the affiliation of a hypertext document within a polarized field of discourse. The method uses a probabilistic model based on cocitation information, and two experiments evaluate it. The first experiment tests the model’s ability to discriminate between left- and right-wing documents about politics, using two data sets: 695 partisan web documents and 162 political weblogs. The cocitation model achieves accuracy above 90%, outperforming lexically based classifiers by a statistically significant margin. The second experiment uses the proposed method to classify the home pages of musical artists by their mainstream or "alternative" appeal; on a set of 515 artist home pages, the model achieves 88% accuracy.
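
As a rough illustration of how cocitation counts could drive such a classifier, the sketch below scores a document by the smoothed log-odds of being cocited with left-leaning versus right-leaning exemplar pages. It is not the model evaluated in the paper, whose details the abstract does not give. The seed lists, the add-one smoothing, the example URLs, and the function names `orientation_score` and `classify` are all hypothetical choices made for illustration.

```python
import math


def orientation_score(cocitations, left_seeds, right_seeds, smoothing=1.0):
    """Return the log-odds of left vs. right orientation for one document.

    cocitations maps a seed URL to the number of pages that link to both
    the document and that seed (hypothetical input format).
    Positive scores lean left, negative scores lean right.
    """
    # Total cocitations with each seed set, smoothed to avoid log(0).
    left = sum(cocitations.get(seed, 0) for seed in left_seeds) + smoothing
    right = sum(cocitations.get(seed, 0) for seed in right_seeds) + smoothing
    # Divide by seed-set size so a larger seed list does not dominate.
    return math.log(left / len(left_seeds)) - math.log(right / len(right_seeds))


def classify(cocitations, left_seeds, right_seeds):
    """Map the log-odds score to a label; a zero score is left undetermined."""
    score = orientation_score(cocitations, left_seeds, right_seeds)
    if score > 0:
        return "left"
    if score < 0:
        return "right"
    return "undetermined"


# Hypothetical exemplar pages and cocitation counts for a single document.
left_seeds = ["left-a.example", "left-b.example"]
right_seeds = ["right-a.example", "right-b.example"]
doc = {"left-a.example": 12, "left-b.example": 7, "right-a.example": 1}

label = classify(doc, left_seeds, right_seeds)
score = orientation_score(doc, left_seeds, right_seeds)
```

The same scoring would apply to the music experiment by swapping in mainstream and "alternative" exemplar pages, again under the assumption that suitable seed sets are available.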