2021
DOI: 10.1075/ijlcr.00019.int

Natural language processing for learner corpus research

Abstract: The term natural language processing (NLP) refers to the use of computer programs to automatically analyze human language. NLP processes range from the (relatively) simple task of splitting character sequences into words and sentences to much more sophisticated (and challenging) tasks such as converting speech sounds into text and annotating texts for syntactic, semantic, and pragmatic features (among others, see Jurafsky & Manning, 2008 for a survey of common NLP processes; and Meurers & Dickinson, 2017 for s…
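
As a rough illustration of the "(relatively) simple task of splitting character sequences into words and sentences" mentioned in the abstract, the following is a minimal sketch using only regular expressions. The function names `split_sentences` and `split_words` and the sample sentence are hypothetical, introduced here for illustration; they are not taken from the paper, and real NLP toolkits handle abbreviations, punctuation, and learner errors far more carefully.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption: a sentence ends at '.', '!' or '?'
    followed by whitespace). Abbreviations such as 'e.g.' will be split wrongly."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_words(sentence):
    """Naive word tokenizer: any run of ASCII letters, digits, or apostrophes
    is treated as one word; punctuation is discarded."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


# Hypothetical learner sentence, including a non-target form ("goed").
sample = "I goed to school yesterday. It was fun!"
sentences = split_sentences(sample)
tokens = [split_words(s) for s in sentences]
```

Even this toy version shows why caution is needed with learner data: the tokenizer happily accepts non-target forms, but anything downstream (tagging, parsing) was typically trained on standard text.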

Cited by 17 publications (13 citation statements)
References 30 publications
“…These methods have the advantage of retrieving combinations efficiently. However, their accuracy with learner writing has only recently started to be documented, so researchers need to exercise caution in their use (e.g., see Huang et al, 2018; and the special issue on working with learner data edited by Kyle (2021)).…”
Section: Definitions Of Collocation and Methods Of Identification
Citation type: mentioning
confidence: 99%
“…Early studies (e.g., Biber, 1988) focused on the analysis of lexical and lexicogrammatical variation across registers (e.g., different spoken and written language use domains). As the subfield of learner corpus research has grown, taggers and parsers have also been used to investigate how second language learners' linguistic patterns change over time (e.g., Crossley & McNamara, 2014; Kyle et al, 2021) and/or differ across proficiency levels (e.g., Biber et al, 2014; Grant & Ginther, 2000; Paquot, 2018).…”
Section: Applied Linguistics Research and Nlp
Citation type: mentioning
confidence: 99%
“…Lexicogrammatical features: A number of studies have investigated the relationship between L2 proficiency and the use of lexicogrammatical features that are common in academic writing such as various types of noun phrase elaboration (e.g., Biber et al, 2014; Grant & Ginther, 2000; Picoral et al, 2021). A related line of research has explored the relationship between characteristics of verb argument construction use and L2 writing proficiency (e.g., Kyle & Crossley, 2017; Kyle et al, 2021). Syntactic complexity: A particularly common use of NLP tools in second language research is the calculation of classic syntactic complexity indices such as mean length of clause and dependent clauses per clause (e.g., Lu, 2010, 2011) or more fine-grained indices such as the number of dependents per nominal (e.g., Díez-Bedmar & Pérez-Paredes, 2020).…”
Section: Lexical Bigrams
Citation type: mentioning
confidence: 99%
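
The two classic indices named in the statement above are simple ratios, which the sketch below makes explicit. It assumes the word, clause, and dependent-clause counts have already been obtained (in practice, from a tagger or parser); the function names and the example counts are hypothetical and are not drawn from any of the cited studies or tools.

```python
def mean_length_of_clause(n_words, n_clauses):
    """Mean length of clause: total words divided by total clauses."""
    if n_clauses <= 0:
        raise ValueError("n_clauses must be positive")
    return n_words / n_clauses


def dependent_clauses_per_clause(n_dependent_clauses, n_clauses):
    """Dependent clauses per clause: dependent clauses divided by all clauses."""
    if n_clauses <= 0:
        raise ValueError("n_clauses must be positive")
    return n_dependent_clauses / n_clauses


# Hypothetical counts for one learner text: 24 words, 4 clauses,
# 1 of which is a dependent clause.
mlc = mean_length_of_clause(24, 4)
dc_per_c = dependent_clauses_per_clause(1, 4)
```

The arithmetic is trivial; the hard part, and the source of the accuracy concerns raised in the citation statements, is counting clauses reliably in learner writing.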
“…The notion of affordance has been sporadically used in research literature to refer to the general idea of 'potentials' in research tools. Kyle (2021), for instance, uses the term to investigate the potentials of specific NLP tools for learner corpus research, while Dobson (2019) uses the term in a somewhat wider sense to critically examine possibilities and limitations of computational methods in the humanities and social sciences at large. However, if used in a too general sense an important element of affordances get lost, namely that affordances are not generic properties but are relative to agent and environment.…”
Section: Affordances Of Digital Research Tools
Citation type: mentioning
confidence: 99%