“…For example, written data from social media (e.g. Facebook and Twitter posts, Reddit forums) has been used to study variation in English phonology (Eisenstein, 2015), morphology (Ilbury, 2020), syntax (Dunn, 2019; Johannsen et al., 2015; Szmrecsanyi et al., 2019) and lexicon (Baker, 2012; Bamman et al., 2014; Eisenstein et al., 2014; Grieve et al., 2018, 2019; Hovy & Johannsen, 2016; Huang et al., 2016; Schmid et al., 2021). Automatisation entails collecting data by scraping it from the Internet (or by applying optical character recognition to printed documents) and then extracting patterns (e.g.…”