2022
DOI: 10.31234/osf.io/h9mvs
Preprint

Evaluation of text-level measures of lexical dispersion: Robustness and consistency

Abstract: The traditional approach to measuring lexical dispersion is to form corpus parts of equal size and then compare the occurrence rate of an item across these units. In recent methodological work, this strategy has met with criticism due to its ignorance of corpus structure. Dispersion, it is argued, should be measured across linguistically meaningful units such as the individual text files constituting the corpus. Though desirable on linguistic grounds, a shift to texts as the unit of analysis raises new methodo…
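
As a rough illustration of the traditional parts-based approach described in the abstract, the following Python sketch splits a token sequence into equal-sized parts and summarizes the per-part counts with Juilland's D. The function name `juilland_d`, the toy token lists, and the choice of D as the summary statistic are illustrative assumptions and are not taken from the preprint.

```python
import math


def juilland_d(tokens, item, n_parts):
    """Juilland's D for `item` over `n_parts` equal-sized corpus parts.

    Hypothetical helper for illustration only. Values near 1 indicate an
    even spread across parts; values near 0 indicate strong clustering.
    """
    if n_parts < 2:
        raise ValueError("need at least two corpus parts")
    part_size = len(tokens) // n_parts  # trailing remainder tokens are dropped
    if part_size == 0:
        raise ValueError("corpus is too short for this many parts")

    # Count occurrences of the item in each equal-sized part.
    counts = [
        tokens[i * part_size:(i + 1) * part_size].count(item)
        for i in range(n_parts)
    ]
    mean = sum(counts) / n_parts
    if mean == 0:
        raise ValueError("item does not occur in the corpus")

    # Population standard deviation of the per-part counts.
    sd = math.sqrt(sum((c - mean) ** 2 for c in counts) / n_parts)
    # D = 1 - (coefficient of variation) / sqrt(n - 1)
    return 1 - (sd / mean) / math.sqrt(n_parts - 1)


evenly_spread = ["a", "x"] * 4                         # one "a" in every part
clustered = ["a", "a", "a", "a", "x", "x", "x", "x"]   # all "a" in the first half
print(juilland_d(evenly_spread, "a", 4))
print(juilland_d(clustered, "a", 4))
```

Because the parts are cut purely by token position, a part can straddle the boundary between two texts, which is the disregard for corpus structure that the abstract refers to.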

Cited by 2 publications (4 citation statements)
References 13 publications
“…In corpus B, instances are more densely clustered, and there are large stretches of text where the item does not occur. In the corpus-linguistic sense, then, the dispersion of the item is higher in corpus A (see Gries 2008, 2020; Sönning 2022b). This dot marks how often the item appeared in the text.…”
Section: Dispersion: Corpus-linguistic Vs Statistical Sense
Citation type: mentioning
confidence: 99%
“…As Table 1 shows, these keyness dimensions allow us to form four linguistically meaningful classes of metrics. For reasons of space, we cannot provide details about the individual measures here, and we refer the reader to Gabrielatos (2018), Rayson & Potts (2020), Gries (2020), and Sönning (2022a, 2022b). The four-way arrangement in Table 1 offers a constructive point of departure for keyness analysis, since it requires the analyst to first consider which features of keyness to emphasize when looking for typical items in the target corpus.…”
Section: Dimensions Of Keyness
Citation type: mentioning
confidence: 99%
“…texts) of different length (cf. Gries 2020; Sönning 2022a). An overview of different dispersion measures is provided in Gries (2020) and Sönning (2022a).…”
Section: Generality: Dispersion In the Target Corpus
Citation type: mentioning
confidence: 99%
“…Gries 2020; Sönning 2022a). An overview of different dispersion measures is provided in Gries (2020) and Sönning (2022a). For an assessment of the generality of an item, we will consider the following measures: D (Juilland et al. 1970), D2 (Carroll 1970), Sadj (Rosengren 1972), DP (Gries 2008; Lijffijt & Gries 2012), DA (Wilcox 1973), and DKL (Gries 2020, 2021).…”
Section: Generality: Dispersion In the Target Corpus
Citation type: mentioning
confidence: 99%
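
To make one of the measures named in the last statement concrete, here is a minimal Python sketch of DP as it is commonly defined following Gries (2008): half the sum of absolute differences between each text's share of the item's occurrences and that text's share of the corpus. It works directly on texts of different length. The function name `deviation_of_proportions` and the example counts are hypothetical and are not drawn from the cited papers.

```python
def deviation_of_proportions(item_counts, text_sizes):
    """DP over texts of unequal length (hypothetical helper, illustration only).

    item_counts[i] is the item's frequency in text i and text_sizes[i] is the
    number of tokens in text i. Values near 0 mean the item is spread in
    proportion to text size; values near 1 mean it is concentrated in a
    small share of the corpus.
    """
    if len(item_counts) != len(text_sizes):
        raise ValueError("need one count and one size per text")
    total_freq = sum(item_counts)
    total_size = sum(text_sizes)
    if total_freq == 0 or total_size == 0:
        raise ValueError("item or corpus is empty")

    # Compare observed shares of the item with expected shares based on size.
    return 0.5 * sum(
        abs(count / total_freq - size / total_size)
        for count, size in zip(item_counts, text_sizes)
    )


sizes = [100, 200, 300]                               # three texts of different length
print(deviation_of_proportions([1, 2, 3], sizes))     # occurrences proportional to size
print(deviation_of_proportions([6, 0, 0], sizes))     # all occurrences in the shortest text
```

Note that DP runs in the opposite direction to Juilland's D: lower values indicate a more even distribution.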