University entrance language tests are often administered under the assumption that even if language proficiency does not determine academic success, a certain proficiency level is still required. Nevertheless, little research has focused on how well L2 students cope with the linguistic demands of their studies in the first months after passing an entrance test. Even fewer studies have taken a longitudinal perspective. Set in Flanders, Belgium, this study examines the opinions and experiences of 24 university staff members and 31 international L2 students, of whom 20 were tracked longitudinally. Attention is also given to test/retest results, academic score sheets, and class recordings. To investigate the validity of inferences made on the basis of L2 students' scores, Kane's (2013) Interpretation/Use Argument approach is adopted, and principles from political philosophy are applied to investigate whether a policy that discriminates among students based on language test results can be considered just. It is concluded that the receptive language requirements of university studies exceed the expected B2 level and that the Flemish entrance tests include language tasks that are of little importance for first-year students. Furthermore, some of the students who failed the entrance test actually managed quite well in their studies, a result with broad implications for validation and justice even outside the study's localized setting.
Research in the field of Language Assessment Literacy (LAL) shows that university admission officers and policy makers are generally not well-versed in matters of LAL. Few studies to date have traced why this may be the case, however, and few studies in the field of language testing have reported on how university admission language requirements are set. Nevertheless, because of the impact of test use on university admissions, developing such knowledge is essential to the progress of LAL as a discipline. This paper reports on a qualitative study that includes all university admission policy makers in one context (Flanders, Belgium). The analyses of the interviews show that the concerns and ideas of LAL scholars may differ substantially from those of university admission policy makers. Real-world policy is determined by pragmatism and compromise, and policy makers, even at universities, may fail to consider empirical findings. Because this study shows that the views of policy makers can be quite dissimilar from the traditional approach taken in the LAL literature, the authors argue that it may be as beneficial to encourage policy literacy among language testing professionals as to expect LAL from policy makers.
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper addresses are (a) whether it is possible to construct a CEFR-based rating scale with novice raters that yields reliable ratings and (b) whether such a scale allows for a uniform interpretation of the descriptors. Additionally, this study focuses on the question of whether co-constructing a rating scale with novice raters helps to stimulate a shared interpretation of the descriptors over time. For this study, six novice raters employed a CEFR-based scale that they had co-constructed with 14 peers to rate 200 spoken and written performances in a missing data design. The quantitative data were analysed using item response theory, classical test theory, and principal component analysis. The focus group data, collected after the rating process, were transcribed and coded using both a priori and inductive coding. The results indicate that novice raters can use the CEFR-based rating scale reliably, but that the interpretations of the descriptors, in spite of training and co-construction, are not as homogeneous as the inter-rater reliability would suggest.
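The abstract above reports analyses based on item response theory, classical test theory, and principal component analysis of ratings collected in a missing data design. As a purely illustrative aid, the Python sketch below (with hypothetical simulated data; it is not the authors' analysis) shows one classical-test-theory-style check of inter-rater consistency when each rater scores only a subset of the performances, so the ratings matrix contains missing cells.

```python
# Minimal sketch of an inter-rater consistency check for a
# missing data design. All numbers and names are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

n_performances, n_raters = 200, 6
true_quality = rng.normal(0.0, 1.0, n_performances)  # latent quality of each performance
severity = rng.normal(0.0, 0.3, n_raters)            # hypothetical rater severity offsets

# Simulate ratings on a 0-5 scale, then blank out cells so each
# rater scores only about half of the performances.
raw = true_quality[:, None] - severity[None, :] + rng.normal(0.0, 0.5, (n_performances, n_raters))
ratings = pd.DataFrame(
    np.clip(np.round(raw + 2.5), 0, 5),
    columns=[f"rater_{i + 1}" for i in range(n_raters)],
)
ratings = ratings.mask(rng.random(ratings.shape) < 0.5)

# Pairwise inter-rater correlations; pandas uses only the
# performances that both raters in a pair actually scored.
corr = ratings.corr()
print(corr.round(2))

# A crude overall consistency index: the mean pairwise correlation.
upper = corr.to_numpy()[np.triu_indices(n_raters, k=1)]
print(f"mean pairwise correlation: {upper.mean():.2f}")
```

A many-facet IRT model of the kind the abstract alludes to would go further and separate rater severity from performance quality on a common scale; the mean pairwise correlation here is only a rough analogue of the inter-rater reliability the study reports.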
This chapter first outlines the theoretical rationale behind task‐based language assessment (TBLA) and discusses why it remains a contested domain within language testing. The construct and key features of tasks in TBLA are reviewed, with particular attention to the use of TBLA for both summative and formative purposes. Both purposes are illustrated by means of two extended case studies. The first focuses on formative TBLA in the context of Dutch‐speaking primary education. It shows how the rating scales and parameters for analyzing task performance in TBLA help the teacher fulfill his or her pedagogic responsibilities. Summative use of TBLA is discussed in the second case study, about a centralized task‐based test of Dutch. It centers on such topics as defining the target language use, performing a needs analysis, and making inferences based on task performance.