“…We use the term toxic content as an umbrella for identity-based attacks such as anti-Semitism or racism posted publicly to social media [19,37,55], bullying in online gaming or replies to posts [35,50], trolling [8], threats of violence, sexual harassment, and more [47,52]. These attacks represent just a subset of abuse stemming from hate and harassment, a much broader threat that encompasses any activity where an attacker attempts to inflict emotional harm on a target (e.g., stalking, doxxing, sextortion, and intimate partner violence) [9,52]. Unlike spam, phishing, or related abuse classification problems that can rely on expert raters, toxic content is an inherently subjective problem as we show in our work.…”