“…Many projects have proposed measurement templates or prompts for measuring bias, usually in large language models (Rudinger et al., 2018; May et al., 2019; Sheng et al., 2019; Kurita et al., 2019; Webster et al., 2020; Gehman et al., 2020; Huang et al., 2020; Vig et al., 2020; Kirk et al., 2021a; Perez et al., 2022), and some select existing sentences from text sources and swap demographic terms heuristically (Zhao et al., 2019; Wang et al., 2021; Papakipos and Bitton, 2022). Since one of our main contributions is the participatory assembly of a large set of demographic terms, our terms can be slotted into essentially any template to measure imbalances across demographic groups.…”
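To make the idea of slotting terms into templates concrete, the following is a minimal sketch in Python. The templates, the terms, the axis names, and the helper `fill_templates` are all hypothetical and chosen only for illustration; none of them is taken from the work quoted above or from the cited projects.

```python
# Hypothetical templates with a single {term} slot (illustrative only).
TEMPLATES = [
    "I am {term}.",
    "My friend is {term}.",
]

# Hypothetical demographic terms grouped by axis (illustrative only).
TERMS_BY_AXIS = {
    "age": ["a teenager", "a retiree"],
    "ability": ["a wheelchair user"],
}


def fill_templates(templates, terms_by_axis):
    """Return one record per (template, axis, term) combination."""
    records = []
    for template in templates:
        for axis, terms in terms_by_axis.items():
            for term in terms:
                records.append(
                    {
                        "axis": axis,
                        "term": term,
                        "text": template.format(term=term),
                    }
                )
    return records


prompts = fill_templates(TEMPLATES, TERMS_BY_AXIS)
for record in prompts:
    print(record["axis"], "|", record["text"])
```

Each filled prompt would then be scored by the model under evaluation, and the scores compared across terms and axes to look for imbalances. The scoring step is omitted here because it depends on the model and on the metric chosen.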