We introduce a classification scheme for detecting political bias in long text content such as newspaper opinion articles. Obtaining long-text data and annotations at sufficient scale for training is difficult, but it is relatively easy to infer the political polarity of tweets from their authorship. We therefore train on tweets and perform inference on articles. Universal sentence encoders and other existing methods that target this domain-adaptation scenario deliver inaccurate and inconsistent predictions on articles, which we show is due to a difference in opinion concentration between tweets and articles. We propose a two-step classification scheme that uses a neutral detector trained on tweets to remove neutral sentences from articles, aligning the opinion concentration of the two domains and thereby improving accuracy on articles. Our implementation is available for public use at https://knowbias.ml.
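As a rough illustration of the two-step idea, the following Python sketch filters neutral sentences out of an article and then scores only the remaining ones. It is not the authors' implementation: the names `split_sentences`, `classify_article`, `toy_is_neutral`, and `toy_polarity` are hypothetical, the keyword-based stand-ins replace the tweet-trained models, and averaging sentence scores is an assumed aggregation rule that the abstract does not specify.

```python
import re
from typing import Callable, List, Optional


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter (assumption: punctuation followed by whitespace)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify_article(
    text: str,
    is_neutral: Callable[[str], bool],
    polarity_score: Callable[[str], float],
) -> Optional[float]:
    """Two-step scheme: drop neutral sentences, then score what remains.

    Returns the mean polarity of the opinionated sentences, or None when
    every sentence is judged neutral. Mean aggregation is an assumption.
    """
    # Step 1: the neutral detector removes sentences that carry no opinion.
    opinionated = [s for s in split_sentences(text) if not is_neutral(s)]
    if not opinionated:
        return None
    # Step 2: the polarity classifier scores only the opinionated sentences.
    scores = [polarity_score(s) for s in opinionated]
    return sum(scores) / len(scores)


# Hypothetical keyword-based stand-ins for the two tweet-trained models.
def toy_is_neutral(sentence: str) -> bool:
    cues = ("terrible", "great", "must", "should")
    return not any(cue in sentence.lower() for cue in cues)


def toy_polarity(sentence: str) -> float:
    # +1.0 and -1.0 are arbitrary labels for the two polarity classes.
    return 1.0 if "great" in sentence.lower() else -1.0


article = "The bill passed on Tuesday. This policy is terrible. The vote was 52 to 48."
result = classify_article(article, toy_is_neutral, toy_polarity)
```

In this toy run only the middle sentence survives the neutral filter, so the article-level score is determined by opinionated text alone, which is the alignment of opinion concentration that the scheme aims for.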
Published Date: 2020-06-02
Registration: ISSN 2374-3468 (Online), ISSN 2159-5399 (Print), ISBN 978-1-57735-835-0 (10 issue set)
Copyright: Published by AAAI Press, Palo Alto, California, USA. Copyright © 2020, Association for the Advancement of Artificial Intelligence. All Rights Reserved.