Social media platforms have been struggling to moderate at scale. In an effort to better cope with content moderation, discussion has turned to the role that automated machine-learning (ML) tools might play. The development of automated systems by social media platforms is a notoriously opaque process, and public values that pertain to the common good are at stake within these often-obscured processes. One site in which social values are being negotiated is the framing of what is considered ‘toxic’ by platforms in the development of automated moderation processes. This study takes into consideration differing notions of toxicity – community, platform and societal – by examining three measures of toxicity and community health (the ML tool Perspective API; Reddit’s 2020 Content Policy; and the Sense of Community Index-2) and how they are operationalised in the context of r/MGTOW, an antifeminist group known for its misogyny. Several stages of content analysis were conducted on the top posts and comments in r/MGTOW to examine how these different measures of toxicity operate. This paper provides insight into the logics and technicalities of automated moderation tools, platform governance structures, and frameworks for understanding community metrics in order to interrogate existing uses of ‘toxicity’ as applied to cultural or social subcommunities online. We distinguish between two commonly used terms: civility and toxicity. Our analysis points to a tension between current social framings and operationalised notions of ‘toxicity’. We argue that there is a clear distinction between civility and toxicity – incivility is a measure of internal perceptions of harm within a community, whereas toxicity is a measure of the capacity for social harms outside the bounds of the community. This nuanced understanding will enable more targeted interventions to be developed to destabilise the internal conditions that make groups like r/MGTOW internally ‘healthy’ yet externally toxic.
Over the past two years, social media platforms have been struggling to moderate at scale. At the same time, they have come under fire for failing to mitigate the risks of perceived ‘toxic’ content or behaviour on their platforms. In an effort to better cope with content moderation and to combat hate speech, ‘dangerous organisations’ and other bad actors present on platforms, discussion has turned to the role that automated machine-learning (ML) tools might play. This paper contributes to thinking about the role and suitability of ML for content moderation on community platforms such as Reddit and Facebook. In particular, it looks at how ML tools operate (or fail to operate) effectively at the intersection between online sentiment within communities and social and platform expectations of acceptable discourse. Through an examination of the r/MGTOW subreddit, we problematise current understandings of the notion of ‘toxicity’ as applied to cultural or social sub-communities online and explain how this interacts with Google’s Perspective tool.
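For readers unfamiliar with how Perspective produces its scores, the minimal sketch below (in Python, assuming the requests library and a valid API key) shows one way to query Perspective’s Comment Analyzer endpoint for a TOXICITY score. The endpoint, attribute name and response structure follow Perspective’s public documentation; the API key and example comment are placeholders, not values used in this study.

    import requests

    API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # placeholder, not a real key
    URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze?key=" + API_KEY)

    def toxicity_score(text):
        """Return Perspective's TOXICITY summary score for a piece of text."""
        body = {
            "comment": {"text": text},
            "languages": ["en"],
            "requestedAttributes": {"TOXICITY": {}},
        }
        response = requests.post(URL, json=body, timeout=10)
        response.raise_for_status()
        # summaryScore.value is a probability-like score between 0 and 1
        return response.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

    # Example: score a single comment drawn from a subreddit thread.
    print(toxicity_score("Example comment text."))

Perspective returns a score between 0 and 1 reflecting the estimated likelihood that a reader would perceive the comment as toxic; any threshold for acting on that score is set by the platform or researcher rather than by the API itself.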