In this paper, we define and apply representational stability analysis (ReStA), an intuitive way of analyzing neural language models. ReStA is a variant of the popular representational similarity analysis (RSA) in cognitive neuroscience. While RSA can be used to compare representations across models, model components, and human brains, ReStA compares instances of the same model while systematically varying a single model parameter.
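To make the idea concrete, here is a minimal sketch of the second-order comparison underlying RSA and, by extension, ReStA. The helper names and the placeholder activations are illustrative assumptions, not the paper's implementation: two instances of the same model (e.g., the same layer run under two settings of one varied parameter) are compared via the correlation of their representational dissimilarity matrices.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(activations):
    """Representational dissimilarity matrix: pairwise (correlation)
    distances between the activation vectors of a set of stimuli."""
    return pdist(activations, metric="correlation")

def resta_score(acts_a, acts_b):
    """Stability between two instances of the same model: the Spearman
    correlation of their RDMs over identical stimuli."""
    rho, _ = spearmanr(rdm(acts_a), rdm(acts_b))
    return rho

# Hypothetical usage: (n_stimuli, dim) activation arrays for the same
# sentences under two settings of a single varied parameter.
acts_setting_a = np.random.randn(50, 768)  # placeholder activations
acts_setting_b = np.random.randn(50, 768)
print(resta_score(acts_setting_a, acts_setting_b))
```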
Language proficiency tests are used to evaluate and compare the progress of language learners. We present an approach for automatic difficulty prediction of C-tests that performs on par with human experts. On the basis of a detailed analysis of newly collected data, we develop a model of C-test difficulty comprising four dimensions: solution difficulty, candidate ambiguity, inter-gap dependency, and paragraph difficulty. We show that cues from all four dimensions contribute to C-test difficulty.
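A hedged sketch of how a four-dimension difficulty model might be assembled. The concrete features, field names, and data below are illustrative stand-ins for the paper's cues, not its actual feature set: one proxy feature per dimension feeds a simple regressor predicting per-gap error rates.

```python
import numpy as np
from sklearn.linear_model import Ridge

def gap_features(gap):
    """One illustrative feature per dimension: solution difficulty,
    candidate ambiguity, inter-gap dependency, paragraph difficulty."""
    return [
        gap["solution_log_freq"],     # rarer solutions are harder
        gap["n_fitting_candidates"],  # more plausible fillers -> ambiguity
        gap["n_dependent_gaps"],      # gaps solvable only via other gaps
        gap["paragraph_difficulty"],  # difficulty of the surrounding text
    ]

# Hypothetical training data: per-gap cues plus observed error rates.
training_gaps = [
    {"solution_log_freq": 5.2, "n_fitting_candidates": 1,
     "n_dependent_gaps": 0, "paragraph_difficulty": 0.3, "error_rate": 0.10},
    {"solution_log_freq": 1.4, "n_fitting_candidates": 4,
     "n_dependent_gaps": 2, "paragraph_difficulty": 0.7, "error_rate": 0.55},
]
X = np.array([gap_features(g) for g in training_gaps])
y = np.array([g["error_rate"] for g in training_gaps])
model = Ridge().fit(X, y)  # predicts per-gap difficulty from the four cues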
We analyze whether large language models are able to predict patterns of human reading behavior. We compare the ability of language-specific and multilingual pretrained transformer models to predict reading time measures reflecting natural human sentence processing in Dutch, English, German, and Russian texts. This yields accurate models of human reading behavior, which indicates that transformer models implicitly encode relative importance in language in a way that is comparable to human processing mechanisms. We find that BERT and XLM models successfully predict a range of eye-tracking features. In a series of experiments, we analyze the cross-domain and cross-language abilities of these models and show how they reflect human sentence processing.
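A minimal sketch of the general setup, assuming token-level regression from a pretrained encoder onto eye-tracking measures; the head, the model choice, and the feature count are assumptions for illustration, not the paper's code.

```python
import torch
from transformers import AutoModel, AutoTokenizer

class EyeTrackingRegressor(torch.nn.Module):
    def __init__(self, name="bert-base-multilingual-cased", n_features=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(name)
        # One regression output per eye-tracking measure
        # (e.g., first fixation duration, total reading time, ...).
        self.head = torch.nn.Linear(self.encoder.config.hidden_size, n_features)

    def forward(self, **inputs):
        hidden = self.encoder(**inputs).last_hidden_state
        return self.head(hidden)  # (batch, tokens, n_features)

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = EyeTrackingRegressor()
batch = tok(["The quick brown fox jumps."], return_tensors="pt")
preds = model(**batch)  # per-token predictions of reading measures
```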
Language proficiency tests are a useful tool for evaluating learner progress, provided that the test difficulty fits the learner's level. In this work, we describe a generalized framework for test difficulty prediction that is applicable to several languages and test types. In addition, we develop two ranking strategies for candidate evaluation, inspired by automatic solving methods based on language model probability and semantic relatedness. These ranking strategies lead to significant improvements in difficulty prediction for cloze tests.
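A hedged sketch of the language-model-probability ranking strategy (the semantic relatedness variant would swap in embedding similarity). The model choice and scoring details are assumptions: a masked LM scores each candidate at the gap position, and gaps with many high-probability competitors suggest higher ambiguity.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-cased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

def rank_candidates(text_with_mask, candidates):
    """Rank gap candidates by the masked LM's probability at the gap."""
    inputs = tok(text_with_mask, return_tensors="pt")
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_pos]
    probs = logits.softmax(-1)
    scores = {c: probs[tok.convert_tokens_to_ids(c)].item() for c in candidates}
    return sorted(scores, key=scores.get, reverse=True)

# Several near-equally probable fillers indicate a harder, more ambiguous gap.
print(rank_candidates(f"The weather is very {tok.mask_token} today.",
                      ["cold", "warm", "blue"]))
```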
In this paper, we analyse the differences between L1 acquisition and L2 learning and identify four main aspects: input quality and quantity, mapping processes, cross-lingual influence, and reading experience. As a consequence of these differences, we conclude that L1 readability measures cannot be directly mapped to L2 readability. We propose to calculate L2 readability for various dimensions and for smaller units. It is particularly important to account for the cross-lingual influence from the learner's L1 and other previously acquired languages, and for the learner's greater reading experience. In our analysis, we focus on lexical readability, as it has been found to be the most influential dimension for L2 reading comprehension. We discuss frequency, lexical variation, concreteness, polysemy, and context specificity as features and analyse their impact on L2 readability. As a new feature specific to L2 readability, we propose the cognateness of words with words in languages the learner already knows. A pilot study confirms our assumption that learners can deduce the meaning of new words from their cognateness to other languages.
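A pilot-style sketch of how a cognateness feature could be operationalized, assuming it can be approximated by normalized string similarity between a target word and its translations in the learner's known languages; the translation entries here are hypothetical stand-ins.

```python
from difflib import SequenceMatcher

def cognateness(word, known_language_words):
    """Highest normalized string similarity between the target word and
    its forms in languages the learner knows; 1.0 = identical form."""
    return max(SequenceMatcher(None, word.lower(), k.lower()).ratio()
               for k in known_language_words)

# e.g., a German/French-speaking learner encountering English "university":
known = {"de": "Universität", "fr": "université"}  # illustrative translations
print(cognateness("university", known.values()))   # high -> likely deducible
```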
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.