“…As mentioned above, model bias will be a particular issue when considering data from participants belonging to different communities, where one group's use of identity terms can systematically alter the results obtained for that group; unlike with lexicon-based approaches, it is not always easy to identify which terms will lead to bias without testing against a data set such as ours. There is a significant body of work seeking to develop less biased language models for a range of tasks (Dixon et al., 2018; Liang et al., 2020; Schick et al., 2021; Ungless et al., 2022; Webster et al., 2021; Zhao et al., 2018), for example by using counterfactually augmented data (Sen et al., 2021), which researchers with the right technical skills may be able to adopt where they have access to the original model. However, for those who must rely on third-party tools, our findings suggest that marginalised individuals, and in particular those with the least salient identities, continue to be affected by predictive bias despite the likely use of debiasing strategies.…”
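To make the idea of probing a third-party tool for identity-term bias more concrete, the sketch below swaps identity terms into otherwise identical template sentences and compares the scores a classifier returns, in the spirit of counterfactual perturbation testing. The term list, templates, and the `score_text` interface are illustrative assumptions, not the tooling or data used in this work.

```python
from typing import Callable, Dict, List

# Hypothetical identity terms to swap in and out of fixed template sentences.
IDENTITY_TERMS: List[str] = ["women", "men", "nonbinary people", "deaf people"]

# Hypothetical neutral templates; the predicted score should not depend on the
# identity term if the model is unbiased with respect to these groups.
TEMPLATES: List[str] = [
    "I went to the park with some {group} yesterday.",
    "The event was organised by {group} from my neighbourhood.",
]


def probe_identity_bias(score_text: Callable[[str], float]) -> Dict[str, float]:
    """Return the mean score per identity term across all templates.

    Large gaps between terms on identical templates are a signal of
    predictive bias in the scoring model.
    """
    means: Dict[str, float] = {}
    for term in IDENTITY_TERMS:
        scores = [score_text(t.format(group=term)) for t in TEMPLATES]
        means[term] = sum(scores) / len(scores)
    return means


if __name__ == "__main__":
    # Stand-in scorer for demonstration only; in practice this would wrap a
    # third-party sentiment or emotion classification API.
    def dummy_scorer(text: str) -> float:
        return 0.0 if "deaf" in text else 0.5  # illustrative biased behaviour

    for term, mean in probe_identity_bias(dummy_scorer).items():
        print(f"{term}: {mean:.2f}")
```

A probe of this kind only surfaces bias for the terms and templates it is given, which is precisely the limitation the passage above notes: without a purpose-built evaluation data set, it is hard to know in advance which identity terms a third-party model will treat differently.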