“…Through careful analyses of their output, we can assess just how much can be learned from the statistical regularities of the linguistic environment (Futrell et al., 2019; Wilcox, Futrell, & Levy, 2022). Some of this work has already been done in the context of encoder‐only masked language models, such as BERT and its related descendants (Ettinger, 2020; Pandia & Ettinger, 2021; Rogers, Kovaleva, & Rumshisky, 2020). Their failures, such as with semantic coherence or pragmatics (Arehalli, Dillon, & Linzen, 2022; Dou, Forbes, Koncel‐Kedziorski, Smith, & Choi, 2022; McClelland et al., 2020), are also interesting and point to other central tenets of usage‐based theories such as the role of environmental contexts, developmental histories, cognitive machinery, and functional pressures in human language learning and use (Christiansen & Chater, 2022).…”