“…Although TLMs were not primarily designed to compute in a human-like way, there are some reasons to suspect that they may have the ability to effectively model at least some aspects of human linguistic reasoning: They consistently demonstrate superior performance (at least compared to other LMs) on human-inspired linguistic benchmarks (Wang et al., 2018, 2019), and they are typically pre-trained using a lengthy process designed to embed deep semantic knowledge, resulting in efficient encoding of semantic relationships (Petroni et al., 2019; Davison et al., 2019). Common optimization tasks for pretraining transformers, such as the masked LM task (Devlin et al., 2018), are quite similar to the word prediction tasks that are known to predict children's performance on other linguistic skills (Borovsky et al., 2012; Neuman et al., 2011; Gambi et al., 2020). Finally, TLMs tend to outperform other LMs in recent work modeling human reading times, eye-tracking data, and other psychological and psycholinguistic phenomena (Merkx and Frank, 2021; Schrimpf et al., 2020a,b; Hao et al., 2020; Bhatia and Richie, 2020).…”
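To make the parallel between the masked LM objective and cloze-style word prediction concrete, the following minimal sketch (not part of the quoted text) queries a pretrained transformer for its top candidates for a masked word. It assumes the Hugging Face `transformers` library and the publicly available `bert-base-uncased` checkpoint; both are illustrative choices rather than anything specified by the cited work.

```python
# Sketch of the masked-LM ("cloze"-style) word prediction task,
# assuming the Hugging Face `transformers` library and the
# `bert-base-uncased` checkpoint (illustrative choices).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model to fill in the masked word, analogous to the word-prediction
# tasks used in the developmental studies cited above.
for candidate in fill_mask("The child drank a glass of [MASK].", top_k=3):
    print(f"{candidate['token_str']!r}  (score: {candidate['score']:.3f})")
```

In this setup the model assigns a probability to each vocabulary item for the masked position, so its ranked guesses can be compared against human completions in a predictive (cloze) task.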