“…Many studies in corpus linguistics aim to analyse lexical differences between corpora of different genres (Tribble, 2000), their regional and diatypic varieties (Oakes & Farrow, 2007), their oral or written modalities (Rayson, Leech & Hodges, 1997), the period of writing (Laviosa, Pagano, Kemppanen & Ji, 2017) or certain sociological characteristics of the speaker or writer, such as gender, age and socio-economic status (Brezina & Meyerhoff, 2014; Marquilhas, 2015), to cite a few examples. This kind of study immediately raises the question of how to decide whether a difference observed when comparing two given corpora (e.g., more occurrences of towards or male in an American English as opposed to a British English corpus) is purely accidental, or whether it reflects a real difference in the way English is used.…”