“…We then rescaled the word counts to get the log2 frequency of occurrences per 1 billion words, so higher values indicate higher log frequencies. We got per-word surprisals for each of 4 different language models, covering a range of common architectures: a Kneser-Ney smoothed 5-gram; the long short-term memory recurrent neural network model of Gulordava et al. (2018), which we refer to as GRNN; Transformer-XL (Dai et al., 2019); and GPT-2 (Radford et al., n.d.), using lm-zoo (Gauthier et al., 2020).²

² We, furthermore, used the R packages bookdown (Version 0.29; Xie, 2016), brms (Version 2.18.0; Bürkner, 2017, 2018, 2021), broom.mixed (Version 0.2.9.4; Bolker & Robinson, 2022), cowplot (Version 1.1.1; Wilke, 2020), gridExtra (Version 2.3; Auguie, 2017), here (Version 1.0.1; Müller, 2020), kableExtra (Version 1.3.4; Zhu, 2021), lme4 (Version 1.1.31; Bates et al., 2015), mgcv (Version 1.8.41; Wood, 2003, 2004, 2011; Wood et al., 2016), mgcViz (Version 0.1.9; Fasiolo et al., 2018), papaja (Version 0.1.1; Aust & Barth, 2022), patchwork (Version 1.1.2; Pedersen, 2022), rticles (Version 0.24.4; Allaire et al., 2022), tidybayes (Version 3.0.2; Kay, 2022), tidymv (Version 3.3.2; Coretta, 2022), and tidyverse (Version 1.3.2; Wickham et al., 2019).…”
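As a rough illustration of the frequency rescaling described in the passage, the R sketch below converts raw corpus counts into log2 occurrences per 1 billion words. It is only a minimal sketch: the data frame word_counts, its columns word and count, the made-up counts, and the corpus_size variable are hypothetical placeholders, not taken from the original materials.

    # Minimal sketch (hypothetical data): rescale raw word counts to
    # log2 frequency of occurrences per 1 billion words.
    library(dplyr)

    word_counts <- tibble::tibble(
      word  = c("the", "model", "surprisal"),
      count = c(5e7, 1e5, 2e3)            # raw counts in some corpus (made up)
    )

    corpus_size <- sum(word_counts$count)  # total tokens in the corpus

    word_counts <- word_counts %>%
      mutate(
        per_billion = count / corpus_size * 1e9,  # occurrences per 1 billion words
        log2_freq   = log2(per_billion)           # higher values = more frequent words
      )

    word_counts

Higher log2_freq values indicate more frequent words, matching the interpretation given in the quoted text.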