📈 New paper published at LRE (IF 1.7)

May 24, 2025·
Alberto Barrón-Cedeño
· 1 min read
A snapshot of the paper

Overview

This study examines whether the psycholinguistic and demographic characteristics of the authors of online texts correlate with how harmful language, such as toxicity and hate speech, is judged. We apply artificial intelligence models to two harmful-language datasets, Jigsaw’s Special Rater Pool dataset and the Measuring Hate Speech dataset, to generate probabilities for several text aspects: the inferred age and gender of the author behind the suspicious text, as well as the expressed emotions, emotionality, sentiment, and communication style. We then perform a statistical regression analysis to examine how these text aspects correlate with the perception of hate speech and toxicity during the annotation process. The study shows that, while the frequency of the psycholinguistic text aspects derivable from the author’s personality does not differ significantly between the harmful and non-harmful classes, the inferred text aspects are statistically associated with the annotators’ perception of harmful language and could potentially influence the way annotators label the texts.
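To make the regression step concrete, here is a minimal sketch on synthetic data: each row stands for one annotated text, the predictors are model-inferred text aspects, and the outcome is a simulated annotator toxicity rating. The variable names (`sentiment`, `emotionality`) and the effect sizes are purely illustrative assumptions, not the paper's actual features or results.

```python
import numpy as np

# Synthetic stand-ins for model-inferred text aspects (illustrative only).
rng = np.random.default_rng(0)
n = 1000
sentiment = rng.random(n)      # e.g. inferred sentiment score per text
emotionality = rng.random(n)   # e.g. inferred emotionality score per text

# Simulated annotator toxicity ratings that, by construction,
# track sentiment but not emotionality.
rating = 0.2 + 1.5 * sentiment + rng.normal(0.0, 0.3, n)

# Ordinary least-squares regression of ratings on the inferred aspects:
# a clearly nonzero slope indicates a statistical association between
# an aspect and the annotators' judgments.
X = np.column_stack([np.ones(n), sentiment, emotionality])
coef, *_ = np.linalg.lstsq(X, rating, rcond=None)
print(coef)  # [intercept, sentiment slope (~1.5), emotionality slope (~0)]
```

In this toy setup the recovered sentiment slope is close to the true 1.5 while the emotionality slope stays near zero, mirroring the kind of aspect-by-aspect association test the paper reports.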

Go further