The selection of optimal feature representations is a critical step in the use of machine learning for text classification. Traditional features (e.g., bag of words and n-grams) have dominated for decades, but in the past five years, learned distributed representations have become increasingly common. In this paper, we summarise and categorise state-of-the-art distributed representation techniques, including word and sentence embedding models. We carry out an empirical analysis of the performance of these feature representations in the scenario of detecting abusive comments. We compare classification accuracies across a range of off-the-shelf embedding models using 10 labelled datasets gathered from different social media platforms. Our results show that multi-task sentence embedding models perform best, achieving consistently the highest classification accuracy among the embedding models compared. We hope our work can serve as a guideline for practitioners selecting appropriate features for text classification tasks, particularly in the domain of abuse detection.