“…The quality of the extracted data and the validity of the research results often depend on such parameters of corpus design as size (how large a corpus is), content (what types of texts or text samples are included, and how many), and representativeness, which is achieved through balancing (what genres are included and in what proportions) and sampling techniques (whether the corpus includes whole texts or text fragments) (Sinclair 1991, 1996; Biber, Conrad & Reppen 1998; Kennedy 1998; McEnery & Wilson 2001; McEnery & Gabrielatos 2006; Meyer 2002; Zanettin 2002a, 2000b, 2011; Baker 2006; Rundell 2008; Anthony 2009; King 2009; Schäfer, Barbaresi & Bildhauer 2013; Losey-Leon 2015). Other parameters that can also be considered are the number of texts, period coverage (or time frame), authorship (e.g., material produced by native and nonnative speakers), and the source of the material (open-access, restricted, copyrighted, etc.).…”