In this paper we employ a novel approach to advancing our understanding of the development of writing in English and German children across school grades using classification tasks. The data come from two recently compiled corpora: the English data come from the GiC corpus (983 school children in second, sixth, ninth and eleventh grade) and the German data from the FD-LEX corpus (930 school children in fifth and ninth grade). The key to this paper is the combined use of what we refer to as 'complexity contours', i.e. series of measurements that capture the progression of linguistic complexity within a text, and Recurrent Neural Network (RNN) classifiers that adequately capture the sequential information in those contours. Our experiments demonstrate that RNN classifiers trained on complexity contours achieve higher classification accuracy than classifiers trained on text-average complexity scores. In a second step, we determine the relative importance of the features from four distinct categories through a Sensitivity-Based Pruning approach.
Recent studies have uncovered substantial individual differences in first language (L1) attainment across the lifespan and across multiple components of language. The existence of such variability raises the question of its role in second language (L2) learning. The existing body of research on L1–L2 relationships has primarily targeted reading comprehension by means of controlled experimental designs. This study extended existing research by investigating L1–L2 relationships in writing through the automatic analysis of linguistic complexity in paired samples of authentic production data. For each writing sample, a series of measurements of 12 indicators was obtained using a computational tool that implements a sliding-window approach. Results from mixed-effects modeling revealed significant relationships between L1 complexity and L2 complexity for all but one measure, indicating that an L1 effect is robust across different levels of linguistic description.
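The sliding-window idea mentioned above can be illustrated with a minimal Python sketch. It is not the tool used in the study: the function names, the window size, the step, and the toy indicator (mean word length) are all assumptions chosen for illustration, and the study's 12 indicators are not reproduced here. A window of consecutive sentences is moved through the text, and the indicator is computed once per window, yielding a series of measurements rather than a single text-level score.

```python
from statistics import mean


def complexity_contour(sentences, measure, window_size=3, step=1):
    """Return one score per window of consecutive sentences.

    All names and defaults here are hypothetical, not taken from the study.
    """
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be positive")
    if not sentences:
        return []
    # A text shorter than one window is scored as a single window.
    if len(sentences) <= window_size:
        return [measure(sentences)]
    last_start = len(sentences) - window_size
    return [
        measure(sentences[start:start + window_size])
        for start in range(0, last_start + 1, step)
    ]


def mean_word_length(window):
    """Toy indicator (an assumption): average word length in the window."""
    words = [word for sentence in window for word in sentence.split()]
    return mean(len(word) for word in words)


# Toy input: four "sentences", scored in windows of two.
sentences = ["a bb", "cc dd", "eeee ffff", "gg hh"]
contour = complexity_contour(sentences, mean_word_length, window_size=2)
text_average = mean_word_length(sentences)
```

Here `contour` holds three window-level scores, while `text_average` collapses the same text into one number. The contour keeps the within-text progression that a single average discards.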