With the starting point that implicit human biases are reflected in the statistical regularities of language, it is possible to measure biases in static word embeddings [1]. With recent advances in natural language processing, state-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears. Current methods of measuring social and intersectional biases in these contextualized word embeddings rely on the effect magnitudes of bias in a small set of pre-defined sentence templates. We propose a new comprehensive method, Contextualized Embedding Association Test (CEAT), based on the distribution of 10,000 pooled effect magnitudes of bias in potential embedding variations and a random-effects model, dispensing with templates. Experiments on social and intersectional biases show that CEAT finds evidence of all tested biases and provides comprehensive information on the variability of effect magnitudes of the same bias in different contexts. Furthermore, we develop two methods, Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD), to automatically identify the intersectional biases and emergent intersectional biases from static word embeddings in addition to measuring them in contextualized word embeddings. We present the first algorithmic bias detection findings on how intersectional group members are associated with unique emergent biases that do not overlap with the biases of their constituent minority identities. IBD achieves an accuracy of 81.6% and 82.7%, respectively, when detecting the intersectional biases of African American females and Mexican American females. EIBD reaches an accuracy of 84.7% and 65.3%, respectively, when detecting the emergent intersectional biases unique to African American females and Mexican American females. The probability of random correct identification in these tasks ranges from 12.2% to 25.5% in IBD and from 1.0% to 25.5% in EIBD.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.