Background: Validity is one of the most fundamental concepts in educational and psychological testing; it refers to the degree to which theoretical and empirical evidence support the interpretations of the scores obtained from a test used for a given purpose. In this paper, we trace the history of validity theory, focusing on its evolution, and explain how to validate the use of a test for a given purpose. Method: We draw largely on the Standards for Educational and Psychological Testing, proposed by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME), which provide a conceptual framework for test validation. We also give a brief description of argument-based validation and its components, outlining the difficulties associated with operationalizing the validation process from an argumentation perspective. Results: Five sources of validity evidence for test scores are proposed: content, response processes, internal structure, relations to other variables, and consequences. Conclusion: Using the Standards allows validity evidence to be accumulated systematically to support the interpretation and use of test scores for a specific purpose, thereby promoting sound measurement practice, which can help reduce the negative consequences arising from the use of high-stakes tests.
This article draws on argument-based validation to gather and evaluate construct-related evidence (i.e., the explanation inference) for a high-stakes test. The data stemmed from the listening component of a French test used for immigration to Canada through the province of Quebec. An expert panel with varied backgrounds in applied linguistics reviewed the items of two operational test forms and mapped them to four listening comprehension sub-skills identified in selected sources of second language listening theory. Based on the expert panel's recommendations, two confirmatory factor models were fit to examinees' response data. The models fit the data well, providing backing for the explanation inference but suggesting construct under-representation for one of the test forms examined. The argument-based approach to validation yielded principled guidelines for evaluating the test's construct coverage across forms, offering insightful guidance on how to organize construct evidence from an argumentation perspective. Implications are discussed as they relate to the operationalization of argument-based validation in high-stakes settings.
Background. Advances in automated analyses of written discourse have made available a wide range of indices that can be used to better understand linguistic features present in language users’ discourse and the relationships these metrics hold with human raters’ assessments of writing.
Purpose. The present study extends previous research in this area by using the TAALES 2.2 software application to automatically extract 484 single and multi-word metrics of lexical sophistication to examine their relationship with differences in assessed L2 English writing proficiency.
Methods. Using a graded corpus of timed, integrated essays from a major academic English language test, correlations and multiple regressions were used to identify specific metrics that best predict L2 English writing proficiency scores.
Results. The most parsimonious regression model retained four predictor variables, with total word count, orthographic neighborhood frequency, lexical decision time, and word naming response time jointly accounting for 36% of the explained variance.
Implications. Results emphasize the importance of writing fluency (by way of total word count) in assessments of this kind. Thus, learners seeking to improve their writing proficiency may benefit from writing activities aimed at increasing speed of production. Furthermore, despite the substantial amount of variance explained by the final regression model, the findings suggest the need for a wider range of metrics that tap into additional aspects of writing proficiency.
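To make the modeling approach concrete, the sketch below fits a small multiple regression by ordinary least squares and reports R². The predictor names and all data values are fabricated for illustration only; they are not drawn from the study's corpus, nor from TAALES output, and the study's actual analysis involved 484 candidate metrics and stepwise model selection not shown here.

```python
# Minimal pure-Python sketch of multiple regression via the normal
# equations (X'X)b = X'y, followed by an R-squared computation.
# All data below are fabricated for illustration.

def ols(X, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return coefficients."""
    n = len(X)
    Xd = [[1.0] + list(row) for row in X]      # prepend intercept column
    k = len(Xd[0])
    # Build the normal equations A*coef = b
    A = [[sum(Xd[i][p] * Xd[i][q] for i in range(n)) for q in range(k)]
         for p in range(k)]
    b = [sum(Xd[i][p] * y[i] for i in range(n)) for p in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * k                           # back substitution
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c]
                              for c in range(r + 1, k))) / A[r][r]
    return coef

def r_squared(X, y, coef):
    """Proportion of variance in y explained by the fitted model."""
    pred = [coef[0] + sum(c * x for c, x in zip(coef[1:], row)) for row in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical predictors: (total word count, mean lexical decision time in ms)
X = [(250, 610), (310, 590), (180, 650), (400, 560), (275, 600), (350, 575)]
y = [3.0, 3.5, 2.5, 4.5, 3.0, 4.0]            # hypothetical essay scores

coef = ols(X, y)
print("intercept and slopes:", [round(c, 4) for c in coef])
print("R^2:", round(r_squared(X, y, coef), 3))
```

In practice, an analysis like the one described would use a statistics package rather than hand-rolled linear algebra; the point here is only to show what "metrics predicting proficiency scores" means operationally.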