“…Many existing mitigations rely on the ability to detect problematic content, often centred on content written by humans on social media platforms, such as Twitter (e.g. Waseem and Hovy, 2016; Wang et al., 2020; Zampieri et al., 2019, 2020; Zhang et al., 2020), Facebook (Glavaš et al., 2020; Zampieri et al., 2020), or Reddit (Han and Tsvetkov, 2020; Zampieri et al., 2020). However, of course, conversational systems may not necessarily have the same patterns as social media content (Cercas Curry et al., 2021).…”