We present results from a meta-analysis of 95 experimental and quasi-experimental pre-K–12 science, technology, engineering, and mathematics (STEM) professional development and curriculum programs, seeking to understand what content, activities, and formats relate to stronger student outcomes. Across rigorously conducted studies, we found an average weighted impact estimate of +0.21 standard deviations. Programs showed stronger outcomes when they helped teachers learn to use curriculum materials; focused on improving teachers' content knowledge, pedagogical content knowledge, and/or understanding of how students learn; incorporated summer workshops; and included teacher meetings to troubleshoot and discuss classroom implementation. We discuss implications for policy and practice.
The public narrative surrounding efforts to improve low-performing K–12 schools in the United States has been notably gloomy. But what is known empirically about whether school improvement works, which policies are most effective, which contexts respond best to intervention, and how long it takes? We meta-analyze 141 estimates from 67 studies of post–No Child Left Behind Act turnaround policies. On average, policies had moderate positive effects on math achievement and no effect on English Language Arts achievement on high-stakes exams. We find positive impacts on low-stakes exams and no evidence of harm on nontest outcomes. Extended learning time and teacher replacements predict greater effects. Contexts serving majority-Latina/o populations saw the largest improvements. We cannot rule out publication bias entirely but find no differences between peer-reviewed and non-peer-reviewed estimates.
How should teachers spend their STEM-focused professional learning time? To answer this question, Heather Hill, Kathleen Lynch, Kathryn Gonzalez, and Cynthia Pollard analyzed a recent wave of rigorous new studies of STEM instructional improvement programs. They found that programs work best when focused on building knowledge teachers can use during instruction. This includes knowledge of the curriculum materials they will use, knowledge of content, and knowledge of how students learn that content. They argue that such learning opportunities improve teachers’ professional knowledge and skill, potentially by supporting teachers in making more informed in-the-moment instructional decisions.
Context
In the past two years, states have implemented sweeping reforms to their teacher evaluation systems in response to Race to the Top legislation and, more recently, NCLB waivers. With these new systems, policymakers hope to make teacher evaluation both more rigorous and more grounded in specific job-performance domains, such as teaching quality and contributions to student outcomes. Attaching high stakes to teacher scores has prompted an increased focus on the reliability and validity of those scores. Teachers' unions have expressed strong concerns about the reliability and validity of using student achievement data to evaluate teachers, and about the potential for subjective ratings by classroom observers to be biased. The legislation enacted by many states also requires that scores derived from teacher observations, and the overall systems of teacher evaluation, be valid and reliable.
Focus of the Study
In this paper, we explore how state education officials and their district and local partners plan to implement and evaluate their teacher evaluation systems, focusing in particular on states' efforts to investigate the reliability and validity of scores emerging from the observational component of these systems.
Research Design
Through document analysis and interviews with state education officials, we explore several issues that arise in observational systems, including the overall generalizability of teacher scores; the training, certification, and reliability of observers; and specifications regarding the sampling and number of lessons observed per teacher.
Findings
Respondents' reports suggest that states are attending to the reliability and validity of scores, but inconsistently; only a few states appear to have a coherent strategy regarding reliability and validity in place.
Conclusions
A variety of system design and implementation decisions remain that states could optimize to increase the reliability and validity of their teacher evaluation scores. A state may audit scores, for instance, yet forgo the gains to reliability and validity that would accrue from periodic rater retraining and recertification, a rigorous program of rater monitoring, and the use of multiple raters per teacher. Most troublesome are decisions about which and how many lessons to sample: these are either mandated legislatively, result from practical concerns or negotiations between stakeholders, or, at best, rest on broad research not directly tied to the state context. This suggests that states should more actively investigate the number of lessons and the lesson-sampling designs required to yield high-quality scores.
Critics of test-based accountability warn that test preparation degrades teachers' instruction by narrowing it to procedural skills. Others argue that adopting more rigorous assessments may incentivize more ambitious test preparation instruction. Drawing on classroom observations and teacher surveys, we find that test preparation activities do predict lower-quality and less ambitious mathematics instruction in upper-elementary classrooms. However, the magnitudes of these relationships appear smaller than the prevailing narrative has warned. Further, our findings call into question the hypothesis that test rigor can serve as a lever to elevate test preparation into ambitious teaching. Improving the quality of mathematics instruction in the midst of high-stakes testing will therefore likely require that policymakers and school leaders undertake comprehensive efforts that look beyond the tests themselves.