Are You Robert or RoBERTa? Deceiving Online Authorship Attribution Models Using Neural Text Generators

Authors

Keenan Jones, Jason R.C. Nurse, Shujun Li

University of Kent, University of Kent, University of Kent

Proceedings:

Vol. 16 (2022): Proceedings of the Sixteenth International AAAI Conference on Web and Social Media

Track:

Full Papers

Abstract:

Recently, there has been a rise in the development of powerful pre-trained natural language models, including GPT-2, Grover, and XLM. These models have shown state-of-the-art capabilities on a variety of NLP tasks, including question answering, content summarisation, and text generation. Alongside this, there have been many studies focused on online authorship attribution (AA), that is, the use of trained models to identify the authors of online texts. Given the power of natural language models in generating convincing texts, this paper examines the degree to which these language models can generate texts capable of deceiving online AA models. Experimenting with both blog and Twitter data, we utilise GPT-2 language models to generate texts using the existing posts of online users. We then examine whether GPT-2-based text generators are capable of mimicking authorial style to such a degree that they can deceive typical AA models. From this, we find that current AI-based text generators are able to successfully mimic authorship, doing so on both datasets. Our findings, in turn, highlight the current capacity of powerful natural language models to generate original online posts capable of mimicking authorial style sufficiently to deceive popular AA methods. This is a key finding given the proposed role of AA in real-world applications such as spam detection and the investigation of criminal activity online, where deceptive texts could be automatically generated to mimic authorship in order to mislead these critical AA systems.

DOI:

10.1609/icwsm.v16i1.19304
