Recently, pre-trained language models (PLMs), e.g., BERT (Devlin et al., 2019), RoBERTa (Liu et al., 2019), GPT-2 (Radford et al., 2019), and DialoGPT (Zhang et al., 2020), have been shown to encode and amplify a range of stereotypical biases, such as racism and sexism (e.g., Kurita et al., 2019a; Dev et al., 2020; Nangia et al., 2020; Lauscher et al., 2021a, inter alia). While such biases provide the basis for interesting academic research, e.g., historical analyses (e.g., Garg et al., 2018; Tripodi et al., 2019; Walter et al., 2021, inter alia), stereotyping constitutes a representational harm (Barocas et al., 2017; Blodgett et al., 2020) and can lead to severe ethical issues in many concrete socio-technical application scenarios by reinforcing societal biases (Hovy and Spruit, 2016; Shah et al., 2020; Mehrabi et al., 2021).