Obtaining grant funding from the National Institutes of Health (NIH) is increasingly competitive, as funding success rates have declined over the past decade. To allocate relatively scarce funds, scientific peer reviewers must differentiate the very best applications from comparatively weaker ones. Despite the importance of this determination, little research has explored how reviewers assign ratings to the applications they review and whether different reviewers evaluate the same application consistently. Replicating all aspects of the NIH peer-review process, we examined 43 individual reviewers' ratings and written critiques of the same group of 25 NIH grant applications. Results showed no agreement among reviewers regarding the quality of the applications in either their qualitative or their quantitative evaluations. Although all reviewers received the same instructions on how to rate applications and format their written critiques, we also found no agreement in how reviewers "translated" a given number of strengths and weaknesses into a numeric rating. The outcome of a grant review thus appeared to depend more on the reviewer to whom the application was assigned than on the research proposed in it. By replicating the NIH peer-review process, this research examines in detail the qualitative and quantitative judgments of different reviewers assessing the same application, and the results have broad relevance for scientific grant peer review.
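As a rough illustration of what "no agreement" means quantitatively (this is not the authors' analysis, and the data below are invented), agreement among reviewers rating the same applications is commonly summarized with an intraclass correlation coefficient; a value near zero indicates that a score says more about which reviewer was assigned than about the application itself. A minimal sketch, assuming hypothetical ratings on NIH's 1-9 scale:

```python
# Illustrative sketch only (hypothetical data, not the study's):
# ICC(1) from a one-way random-effects model as a measure of inter-reviewer agreement.
import numpy as np

def icc1(scores: np.ndarray) -> float:
    """scores: (n_applications, k_raters) matrix of numeric ratings."""
    n, k = scores.shape
    grand_mean = scores.mean()
    app_means = scores.mean(axis=1)
    # One-way ANOVA mean squares: between applications and within applications
    ms_between = k * np.sum((app_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((scores - app_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical ratings on NIH's 1 (exceptional) to 9 (poor) scale:
# 25 applications, each rated independently by 2 assigned reviewers.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 10, size=(25, 2)).astype(float)
print(f"ICC(1) = {icc1(ratings):.2f}")  # close to 0 when ratings are unrelated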
In scientific grant peer review, groups of expert scientists meet to engage in the collaborative decision-making task of evaluating and scoring grant applications. Prior research on grant peer review has established that inter-reviewer reliability is typically poor. In the current study, experienced reviewers for the National Institutes of Health (NIH) were recruited to participate in one of four constructed peer review panel meetings. Each panel discussed and scored the same pool of recently reviewed NIH grant applications. We examined the degree of intra-panel variability in panels' scores of the applications before versus after collaborative discussion, as well as the degree of inter-panel variability. We also analyzed videotapes of reviewers' interactions for instances of one particular form of discourse, Score Calibration Talk, as one factor influencing the variability we observed. Results suggest that although reviewers within a single panel agree more following collaborative discussion, different panels agree less after discussion, and that Score Calibration Talk plays a pivotal role in scoring variability during peer review. We discuss implications of this variability for the scientific peer review process.

Published in final edited form as: Res Eval. 2017 January; 26(1): 1-14. doi:10.1093/reseval/rvw025.

Keywords: peer review; discourse analysis; decision making; collaboration

As the primary means by which scientists secure funding for their research programs, grant peer review is a keystone of scientific research. The largest funding agency for biomedical, behavioral, and clinical research in the USA, the National Institutes of Health (NIH), spends more than 80% of its $30.3 billion annual budget on funding research grants evaluated via peer review (NIH 2016). As part of the mechanism by which this money is allocated to scientists, collaborative peer review panels of expert scientists (referred to as 'study sections' at NIH) convene to evaluate grant applications and assign scores that inform later funding decisions by NIH governance. Thus, deepening our understanding of how peer review ostensibly identifies the most promising, innovative research is crucial for the scientific community writ large. The present study builds upon existing work evaluating the reliability of peer review by examining how the discourse practices of reviewers during study section meetings may contribute to low reliability in peer review outcomes.

The NIH peer review process is structured around study sections that engender distributed expertise (Brown et al. 1993), as reviewers evaluate applications based on their particular domain(s) of expertise but then share their specialized knowledge with others who have related but distinct expertise. The very structure of study sections thus facilitates what Brown and colleagues (1993) identify as mutual appropriation among group...
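To make the two kinds of variability concrete, the minimal sketch below uses invented scores (the array shapes, panel sizes, and 1-9 scale are assumptions, not the study's data) to contrast intra-panel spread with inter-panel spread for the same applications; in the paper's terms, collaborative discussion narrowed the former within each panel while widening the latter across panels:

```python
# Illustrative sketch only (hypothetical scores, not data from the study):
# intra-panel variability = spread of reviewers' scores within one panel for an application;
# inter-panel variability = spread of the panels' mean scores for that same application.
import numpy as np

rng = np.random.default_rng(1)
n_apps, n_panels, reviewers_per_panel = 25, 4, 8

# scores[a, p, r]: reviewer r's score of application a in panel p (NIH 1-9 scale)
scores = rng.normal(loc=5.0, scale=1.5, size=(n_apps, n_panels, reviewers_per_panel))
scores = scores.clip(1, 9)

# Intra-panel variability: spread within each panel, averaged over applications and panels
intra_panel_sd = scores.std(axis=2).mean()

# Inter-panel variability: spread of panel-level means, averaged over applications
panel_means = scores.mean(axis=2)              # shape (n_apps, n_panels)
inter_panel_sd = panel_means.std(axis=1).mean()

print(f"mean intra-panel SD: {intra_panel_sd:.2f}")
print(f"mean inter-panel SD: {inter_panel_sd:.2f}")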