The washback of diagnostic tools aimed at young migrant learners is an under-researched area in language assessment. This paper explores teachers’ perceptions of the Greek Diagnostic Language Assessment (GDLA) tool, recently introduced into the SL preparatory classes of primary education in Cyprus. The tool’s implementation coincides with the launch of a new SL curriculum. The objective is fourfold: (1) to examine GDLA’s washback effects on teaching and assessment, (2) to investigate how washback varies with several contextual variables, (3) to collect feedback on the perceived credibility of the tool, and (4) to reflect on the use of the GDLA tool as a lever of instructional reform in support of curricular innovation. The study employs a mixed-methods approach and draws on (a) quantitative data (questionnaire, 234 informants) and (b) qualitative data (interviews, 6 participants). The results indicate positive and fairly strong washback on teaching and assessment. However, they also reveal several misconceptions about the purpose and implementation of diagnostic assessment, pointing to gaps in teachers’ assessment literacy, and they highlight constraints imposed by school administration. Finally, they suggest that aligning a diagnostic assessment with a context-sensitive curriculum may help secure positive washback.
This longitudinal study (2002–2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of individual raters are not stable. Thus, predictions from one administration to the next are difficult, although not impossible. In fact, as the membership of the group of raters changes from year to year, past data on rating characteristics become less useful. When the membership of the group of raters is retained, the community of raters develops more stable characteristics. However, “cultural shocks” (low retention of raters and large numbers of newcomers) destabilize the rating characteristics of the community and predictions become more difficult. We propose practical measures to increase the stability of rating across time and offer methodological suggestions for more efficient rater effect-related research designs and analyses.