Automatic item generation (AIG) has the potential to greatly expand the number of items for educational assessments, while simultaneously allowing for a more construct-driven approach to item development. However, the traditional item modeling approach in AIG is limited in scope to content areas that are relatively easy to model (such as math problems), and depends on highly skilled content experts to create each model. In this paper we describe the interactive reading task, a transformer-based deep language modeling approach for creating reading comprehension assessments. This approach allows a fully automated process for the creation of source passages together with a wide range of comprehension questions about the passages. The format of the questions allows automatic scoring of responses with high fidelity (e.g., selected response questions). We present the results of a large-scale pilot of the interactive reading task, with hundreds of passages and thousands of questions. These passages were administered as part of the practice test of the Duolingo English Test. Human review of the materials and psychometric analyses of test taker results demonstrate the feasibility of this approach for automatic creation of complex educational assessments.
Many online and mobile applications rely on daily emails and push notifications to increase and maintain user engagement. The multiarmed bandit approach provides a useful framework for optimizing the content of these notifications, but a number of complications (such as novelty effects and conditional eligibility) make conventional bandit algorithms unsuitable in practice. In this paper, we introduce the Recovering Difference Softmax Algorithm to address the particular challenges of this problem domain, and use it to successfully optimize millions of daily reminders for the online language-learning app Duolingo. This led to a 0.5% increase in total daily active users (DAUs) and a 2% increase in new user retention over a strong baseline. We provide technical details of its design and deployment, and demonstrate its efficacy through both offline and online evaluation experiments. CCS CONCEPTS • Computing methodologies → Sequential decision making; • Mathematics of computing → Probability and statistics.
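To make the two complications named in the abstract above concrete, here is a minimal sketch of softmax arm selection that handles conditional eligibility and a novelty effect. It is not the Recovering Difference Softmax Algorithm itself, whose details the abstract does not give. The function names, the exponentially decaying recency penalty, and the parameters `penalty`, `half_life`, and `temperature` are all illustrative assumptions.

```python
import math
import random


def softmax_probs(scores, temperature=1.0):
    """Map a dict of arm -> score to a dict of arm -> softmax probability."""
    if not scores:
        raise ValueError("no eligible arms")
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {arm: math.exp((s - top) / temperature) for arm, s in scores.items()}
    total = sum(exps.values())
    return {arm: e / total for arm, e in exps.items()}


def choose_arm(scores, eligible, days_since_shown,
               penalty=0.5, half_life=7.0, temperature=1.0, rng=random):
    """Sample one notification template (arm) for a single user.

    scores: arm -> estimated value of sending that template.
    eligible: arms this user may receive today (conditional eligibility).
    days_since_shown: arm -> days since this user last saw it; arms the
        user has never seen are absent and get no penalty.
    The penalty and its half-life are hypothetical: they lower the score
    of a recently shown arm and let it recover as time passes.
    """
    adjusted = {}
    for arm in eligible:
        days = days_since_shown.get(arm)
        decay = 0.0 if days is None else penalty * 0.5 ** (days / half_life)
        adjusted[arm] = scores[arm] - decay
    probs = softmax_probs(adjusted, temperature)
    arms = list(probs)
    pick = rng.choices(arms, weights=[probs[a] for a in arms], k=1)[0]
    return pick, probs


if __name__ == "__main__":
    scores = {"streak": 0.10, "tip": 0.10, "friend": 0.90}
    pick, probs = choose_arm(scores, eligible=["streak", "tip"],
                             days_since_shown={"streak": 0})
    print(pick, probs)
```

In this sketch the arm `friend` never receives probability mass because the user is not eligible for it, and `streak` is sampled less often than `tip` despite the equal score because it was shown today.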
This paper presents the Duolingo English Test’s speaking construct, situated within the Duolingo English Test assessment ecosystem (Burstein et al., 2022). We describe how the Duolingo English Test defines, operationalizes, and measures speaking through various speaking-related item types. The operationalization and measurement of the speaking construct include the item-type design process and automated item generation processes.
Assessments, especially those used for high-stakes decision making, draw on evidence-based frameworks. Such frameworks inform every aspect of the testing process, from development to results reporting. The frameworks that language assessment professionals use draw on theory in language learning, assessment design, and measurement and psychometrics in order to provide underpinnings for the evaluation of language skills including speaking, writing, reading, and listening. This paper focuses on the construct, or underlying trait, of writing ability. The paper conceptualizes the writing construct for the Duolingo English Test, a digital-first assessment. “Digital-first” includes technology such as artificial intelligence (AI) and machine learning, with human expert involvement, throughout all item development, test scoring, and security processes. This work is situated in the Burstein et al. (2022) theoretical ecosystem for digital-first assessment, the first representation of its kind that incorporates design, validation/measurement, and security all situated directly in assessment practices that are digital first. The paper first provides background information about the Duolingo English Test and then defines the writing construct, including the purposes for writing. It also introduces principles underpinning the design of writing items and illustrates sample items that assess the writing construct.