2021
DOI: 10.1075/ijlcr.20004.jar

How operationalizations of word types affect measures of lexical diversity

Abstract: This study tests three measures of lexical diversity (LD), each using five operationalizations of word types. The measures include MTLD (measure of textual lexical diversity), MTLD-W (moving average MTLD with wrap-around measurement), and MATTR (moving average type-token ratio). Each of these measures is tested with types operationalized as orthographic forms, lemmas using automated POS tags, lemmas using manually corrected POS tags, flemmas (list-based lemmas that do not distinguish between parts of speech), …
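
To make the idea of "operationalizing word types" concrete, here is a minimal sketch of MATTR, written for illustration only and not taken from the study. MATTR is the mean type-token ratio over every fixed-length window of a text. The window size, the lowercasing used for orthographic forms, the `normalize` parameter, and the toy lemma table are all assumptions made for this example.

```python
from typing import Callable, Sequence


def mattr(tokens: Sequence[str],
          window: int = 50,
          normalize: Callable[[str], str] = str.lower) -> float:
    """Moving-average type-token ratio.

    `normalize` decides what counts as a type: the default (lowercased
    orthographic form) is an assumption, and a lemmatizer could be
    passed in its place.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    units = [normalize(t) for t in tokens]
    if not units:
        raise ValueError("text has no tokens")
    # Texts no longer than one window fall back to a plain TTR.
    if len(units) <= window:
        return len(set(units)) / len(units)
    # Type-token ratio of every window, then the mean of those ratios.
    ratios = [len(set(units[i:i + window])) / window
              for i in range(len(units) - window + 1)]
    return sum(ratios) / len(ratios)


# Hypothetical toy lemma table, for illustration only.
TOY_LEMMAS = {"runs": "run", "running": "run", "ran": "run"}


def toy_lemma(token: str) -> str:
    """Map a token to its toy lemma, or to its lowercased form."""
    return TOY_LEMMAS.get(token.lower(), token.lower())


sample = ["run", "runs", "running", "ran"]
orthographic_score = mattr(sample, window=2)
lemma_score = mattr(sample, window=2, normalize=toy_lemma)
```

On this toy input the orthographic operationalization treats all four forms as distinct types, while the toy lemma operationalization collapses them into one, so the same text receives a lower diversity score.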

Cited by 12 publications (12 citation statements)
References 39 publications
“…We did not carry out an evaluation of the reliability on the POS annotation directly given that POS-taggers have been shown to perform generally well on learner data, provided that texts are manually spell-corrected before being tagged (De Haan, 2000; van Rooy & Schäfer, 2002), which was the case here. Jarvis and Hashimoto (2021), for example, report F-scores above 0.90 for all POS categories in a corpus of L2 English narrative texts, indicating a high level of reliability when compared to human annotation.…”
Section: Methods; citation type: mentioning
confidence: 94%
“…This study leads to two possible future directions. First, a comparison that assesses vocabulary development in heavily inflected languages like Hebrew concerning analytical languages, with a focus on differences between lexical units: word forms, lemmas, flemmas, and word families (Jarvis & Hashimoto, 2021). Second, distinguishing between proficiency levels based on the lexical units in such heavily inflected languages.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…There are at least two issues that warrant further exploration. The first is the degree to which operationalization of lexical items (lemmatized versus unlemmatized orthographic forms) affects measurements of diversity in Spanish L2 texts (see Jarvis & Hashimoto, 2021, for some explorations into this issue for L2 English texts). The second issue is operationalization of lexical diversity.…”
Section: Discussion; citation type: mentioning
confidence: 99%