Evolution of Reporting P Values in the Biomedical Literature, 1990-2015

Chavalarias, David; Wallach, John R.; Li, Alvin Ho Ting; Ioannidis, John P. A.

doi:10.1001/jama.2016.1952

Cited by 345 publications

(310 citation statements)

References 37 publications

Supporting

Mentioning

275

Contrasting

Unclassified


“…A failure to support some of the pathways may not sufficiently compromise model fit to warrant rejection, particularly if the test focuses on evaluating difference from the null rather than a specified size and direction of the effect. Consistent with calls to focus on effect size rather than statistical significance and null hypothesis significance testing (Trafimow and Rice, 2009; Cumming, 2014; Chavalarias et al, 2016; McShane et al, 2017), researchers would do well to specify an expected effect size (e.g., a small, medium, or large effect based on Cohen's taxonomy of effect sizes), a range of values for the effect, or the smallest effect size of interest, based on previous evidence for each prediction within the model tested (Lakens, 2014). This level of specificity increases the stringency of test of the nomological network and increases its validity as a contribution to evidence in support of, or disconfirming, the model.…”

Section: Use Of Confirmatory Analytic Approaches In Nomological Validity (mentioning)

confidence: 74%

On nomological validity and auxiliary assumptions: The importance of simultaneously testing effects in social cognitive theories applied to health behavior and some guidelines

Hagger, Gucciardi, Chatzisarantis

2017

Preprint


Tests of social cognitive theories provide informative data on the factors that relate to health behavior, and the processes and mechanisms involved. In the present article, we contend that tests of social cognitive theories should adhere to the principles of nomological validity, defined as the degree to which predictions in a formal theoretical network are confirmed. We highlight the importance of nomological validity tests to ensure theory predictions can be disconfirmed through observation. We argue that researchers should be explicit on the conditions that lead to theory disconfirmation, and identify any auxiliary assumptions on which theory effects may be conditional. We contend that few researchers formally test the nomological validity of theories, or outline conditions that lead to model rejection and the auxiliary assumptions that may explain findings that run counter to hypotheses, raising potential for 'falsification evasion.' We present a brief analysis of studies (k = 122) testing four key social cognitive theories in health behavior to illustrate deficiencies in reporting theory tests and evaluations of nomological validity. Our analysis revealed that few articles report explicit statements suggesting that their findings support or reject the hypotheses of the theories tested, even when findings point to rejection. We illustrate the importance of explicit a priori specification of fundamental theory hypotheses and associated auxiliary assumptions, and identification of the conditions which would lead to rejection of theory predictions. We also demonstrate the value of confirmatory analytic techniques, meta-analytic structural equation modeling, and Bayesian analyses in providing robust converging evidence for nomological validity. We provide a set of guidelines for researchers on how to adopt and apply the nomological validity approach to testing health behavior models.


“…Although based on relatively small samples of studies (93 in psychology, and 16 in experimental economics, after excluding initial studies with P > 0.05), these numbers are suggestive of the potential gains in reproducibility that would accrue from the new threshold of P < 0.005 in these fields. In biomedical research, 96% of a sample of recent papers claim statistically significant results with the P < 0.05 threshold¹⁰. However, replication rates were very low⁵ for these studies, suggesting a potential for gains by adopting this new standard in these fields as well.…”

Section: Why 0.005 (mentioning)

confidence: 99%

Redefine statistical significance

et al. 2017

Self Cite


We propose to change the default P-value threshold for statistical significance from 0.05 to 0.005 for claims of new discoveries. The lack of reproducibility of scientific studies has caused growing concern over the credibility of claims of new discoveries based on 'statistically significant' findings. There has been much progress toward documenting and addressing several causes of this lack of reproducibility (for example, multiple testing, P-hacking, publication bias and under-powered studies). However, we believe that a leading cause of non-reproducibility has not yet been adequately addressed: statistical standards of evidence for claiming new discoveries in many fields of science are simply too low. Associating statistically significant findings with P < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural and reporting problems. For fields where the threshold for defining statistical significance for new discoveries is P < 0.05, we propose a change to P < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields. Results that would currently be called significant but do not meet the new threshold should instead be called suggestive. While statisticians have known the relative weakness of using P ≈ 0.05 as a threshold for discovery and the proposal to lower it to 0.005 is not new¹,², a critical mass of researchers now endorse this change. We restrict our recommendation to claims of discovery of new effects. We do not address the appropriate threshold for confirmatory or contradictory replications of existing claims. We also do not advocate changes to discovery thresholds in fields that have already adopted more stringent standards (for example, genomics and high-energy physics research; see the 'Potential objections' section below). We also restrict our recommendation to studies that conduct null hypothesis significance tests. We have diverse views about how best to improve reproducibility, and many of us believe that other ways of summarizing the data, such as Bayes factors or other posterior summaries based on clearly articulated model assumptions, are preferable to P values. However, changing the P value threshold is simple, aligns with the training undertaken by many researchers, and might quickly achieve broad acceptance.
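The abstract's claim that a P < 0.05 threshold yields a high rate of false positives can be illustrated with simple arithmetic. The sketch below is not taken from the cited article: the function name, the prior probability of a true effect (1 in 11, i.e. prior odds of 1:10), and the fixed power of 0.8 are all assumptions chosen for illustration. Holding power fixed while lowering the threshold is itself a simplification, since power would normally fall at a fixed sample size.

```python
def false_positive_risk(alpha: float, power: float, prior_true: float) -> float:
    """Share of 'significant' findings that are false positives.

    alpha:      significance threshold (type I error rate under the null)
    power:      probability of a significant result when the effect is real
    prior_true: assumed prior probability that a tested effect is real
    """
    false_positives = alpha * (1.0 - prior_true)  # nulls that reach significance
    true_positives = power * prior_true           # real effects that reach significance
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Assumed inputs: prior odds of 1:10 for a real effect, power held at 0.8.
    for alpha in (0.05, 0.005):
        risk = false_positive_risk(alpha, power=0.8, prior_true=1 / 11)
        print(f"alpha = {alpha}: false positive risk = {risk:.1%}")
```

Under these assumed inputs, the share of significant findings that are false positives drops substantially when the threshold moves from 0.05 to 0.005; different priors or power values would change the numbers but not the direction.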


“…What I hadn't anticipated, however, was how the inexorable progression of efforts to improve transparency in research reporting would occupy a principal focus of my activities since I became the editor. What is readily apparent in the nutrition literature is an unrealistic preponderance of positive studies and implausible findings (28,34,94,138,169). Although these might make good copy for reporters and fodder for food faddists, they undermine the credibility of nutrition as a serious science.…”

Section: Edit or Perish (mentioning)

confidence: 99%

Nutrition from the Inside Out

Bier

2017

Annu. Rev. Nutr.


Nearly 50 years ago, I set out to investigate the clinical problem of hypoglycemia in children with illnesses that limited their food intake. My goal was to gather accurate and precise measurable data. At the time, I wasn't interested in nutrition as a discipline defined in its more general or popular sense. To address the specific problem that interested me required development of entirely new methods based on stable, nonradioactive tracers that satisfied the conditions of accuracy and precision. At the time, I had no inclination of the various theoretical and practical problems that would have to be solved to achieve this goal. Some are briefly described here. Nor did I have the slightest idea that developing the field would result in a fundamental change in how human clinical investigation was conducted, with the eventual replacement of radiotracers with stable isotopically labeled ones, even for adult clinical investigation. Additionally, I had no inclination that the original questions would open avenues to much broader questions of practical nutritional relevance. Moreover, only much later as the editor of The American Journal of Clinical Nutrition did I appreciate the policy implications of how nutritional data are presented in the scientific literature. At least in part, less accurate and precise measurements and less than full transparency in reporting nutritional data have resulted in widespread debate about the public policy recommendations and guidelines that are the intended result of collecting the data in the first place. This article provides a personal recollection (with all the known faults of self-reporting and retrospective memory) of the journey that starts with measurement certainty and ends with policy uncertainty.

