States use standards-based English language proficiency (ELP) assessments to inform relatively high-stakes decisions for English learner (EL) students. Results from these assessments are among the primary criteria used to determine EL students' level of ELP and their readiness for reclassification. The results are also used to evaluate the effectiveness of, and to allocate funding to, district or school programs that serve EL students. To provide empirical validity evidence for these important uses of ELP assessments, this study examined the constructs of ELP assessments as a fundamental validity issue. In particular, the study examined the types of language proficiency measured in three sample states' ELP assessments and the relationship between each type of language proficiency and content assessment performance. The results revealed notable variation in the presence of academic and social language across the three ELP assessments. A series of hierarchical linear modeling (HLM) analyses also revealed varied relationships among social language proficiency, academic language proficiency, and content assessment performance. The findings highlight the importance of examining the constructs of ELP assessments in order to make appropriate interpretations and decisions based on assessment scores for EL students. Implications for policy and practice are discussed.
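For readers unfamiliar with how such an analysis is typically specified, the following is a minimal, hypothetical sketch of a two-level hierarchical linear model in Python (students nested within schools), with academic and social language proficiency entered as predictors of content assessment performance. The dataset, column names, and grouping variable are illustrative assumptions and are not drawn from the study.

    # Hypothetical sketch of a two-level HLM: random intercepts for schools,
    # fixed effects for academic and social ELP subscores.
    # Assumes a student-level table with columns:
    #   content_score, academic_elp, social_elp, school_id
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("el_students.csv")  # hypothetical student-level data file

    model = smf.mixedlm(
        "content_score ~ academic_elp + social_elp",  # fixed effects
        data=df,
        groups=df["school_id"],  # students nested within schools
    )
    result = model.fit()
    print(result.summary())  # coefficients show each proficiency type's association with content performance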
Throughout the world, tests are administered to some examinees who are not fully proficient in the language in which they are tested. It has long been acknowledged that proficiency in the language of test administration often affects examinees' performance. Depending on the context and the intended uses of a particular assessment, linguistic proficiency may be relevant to the tested construct and its interpretation, or it may be a source of construct-irrelevant variance that undermines accurate interpretation of the test performance of linguistic minorities who are not proficient in the language of the assessment. In this article, we highlight key validity issues to consider when testing linguistic minorities, regardless of whether language is central to the construct or construct-irrelevant. We discuss examples of the different types of studies that test users and developers could conduct to evaluate the validity of linguistic minorities' scores; these issues span test development and validation activities. We conclude with a list of critical factors to consider in test development and evaluation whenever linguistic minorities are tested.