We propose to change the default P-value threshold for statistical significance from 0.05 to 0.005 for claims of new discoveries. The lack of reproducibility of scientific studies has caused growing concern over the credibility of claims of new discoveries based on 'statistically significant' findings. There has been much progress toward documenting and addressing several causes of this lack of reproducibility (for example, multiple testing, P-hacking, publication bias and underpowered studies). However, we believe that a leading cause of non-reproducibility has not yet been adequately addressed: statistical standards of evidence for claiming new discoveries in many fields of science are simply too low. Associating statistically significant findings with P < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural and reporting problems.

For fields where the threshold for defining statistical significance for new discoveries is P < 0.05, we propose a change to P < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields. Results that would currently be called significant but do not meet the new threshold should instead be called suggestive. Although statisticians have long recognized the relative weakness of using P ≈ 0.05 as a threshold for discovery, and the proposal to lower it to 0.005 is not new [1,2], a critical mass of researchers now endorse this change.

We restrict our recommendation to claims of discovery of new effects. We do not address the appropriate threshold for confirmatory or contradictory replications of existing claims. We also do not advocate changes to discovery thresholds in fields that have already adopted more stringent standards (for example, genomics and high-energy physics research; see the 'Potential objections' section below).

We also restrict our recommendation to studies that conduct null hypothesis significance tests. We have diverse views about how best to improve reproducibility, and many of us believe that other ways of summarizing the data, such as Bayes factors or other posterior summaries based on clearly articulated model assumptions, are preferable to P values. However, changing the P value threshold is simple, aligns with the training undertaken by many researchers, and might quickly achieve broad acceptance.
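To make the false-positive argument concrete, here is a minimal sketch in Python assuming a simple two-hypothesis screening model, in which some fraction of tested hypotheses are truly non-null. The function name, the 80% power, and the 1-in-10 prior odds are illustrative assumptions, not figures from the proposal itself.

```python
# Hypothetical illustration: the false positive risk among "significant"
# results, given a prior probability that the tested effect is real, the
# significance threshold alpha, and the study's power.
def false_positive_risk(alpha, power, prior_true):
    """P(H0 true | p < alpha) under a simple two-hypothesis screening model."""
    true_pos = power * prior_true          # real effects correctly detected
    false_pos = alpha * (1 - prior_true)   # null effects crossing the threshold
    return false_pos / (false_pos + true_pos)

# With a 1-in-10 prior probability that the effect is real and 80% power:
for alpha in (0.05, 0.005):
    print(f"alpha = {alpha}: FPR = {false_positive_risk(alpha, 0.80, 0.10):.1%}")
# alpha = 0.05  -> roughly 36% of "discoveries" would be false
# alpha = 0.005 -> roughly 5%
```

Under these assumed inputs, tightening the threshold by a factor of ten cuts the false positive risk from about a third of claimed discoveries to about one in twenty.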
Expert knowledge is used widely in the science and practice of conservation because of the complexity of problems, relative lack of data, and the imminent nature of many conservation decisions. Expert knowledge is substantive information on a particular topic that is not widely known by others. An expert is someone who holds this knowledge and who is often deferred to in its interpretation. We refer to predictions by experts of what may happen in a particular context as expert judgments. In general, an expert-elicitation approach consists of five steps: deciding how information will be used, determining what to elicit, designing the elicitation process, performing the elicitation, and translating the elicited information into quantitative statements that can be used in a model or directly to make decisions. This last step is known as encoding. Considerations in eliciting expert knowledge include determining how to work with multiple experts and how to combine multiple judgments, minimizing bias in the elicited information, and verifying the accuracy of expert information. We highlight structured elicitation techniques that, if adopted, will improve the accuracy and information content of expert judgment and ensure uncertainty is captured accurately. We suggest four aspects of an expert elicitation exercise be examined to determine its comprehensiveness and effectiveness: study design and context, elicitation design, elicitation method, and elicitation output. Just as the reliability of empirical data depends on the rigor with which it was acquired, so too does that of expert knowledge.
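As an illustration of the encoding step, the sketch below translates a hypothetical expert's three-point judgment (lower limit, best guess, upper limit) into a probability distribution usable in a model. The triangular distribution is one common choice but is our assumption here, not a method prescribed by the authors, and all numbers are invented.

```python
# A minimal sketch of "encoding": turning an elicited three-point judgment
# into a triangular distribution with SciPy. Numbers are hypothetical.
from scipy import stats

lower, best, upper = 120.0, 180.0, 300.0   # elicited population estimate
dist = stats.triang(
    c=(best - lower) / (upper - lower),    # mode's relative position in [0, 1]
    loc=lower,                             # left endpoint of the support
    scale=upper - lower,                   # width of the support
)

print(f"mean = {dist.mean():.1f}")
print(f"90% credible interval = {dist.ppf(0.05):.1f} to {dist.ppf(0.95):.1f}")
```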
Error bars commonly appear in figures in publications, but experimental biologists are often unsure how they should be used and interpreted. In this article we illustrate some basic features of error bars and explain how they can help communicate data and assist correct interpretation. Error bars may show confidence intervals, standard errors, standard deviations, or other quantities. Different types of error bars give quite different information, and so figure legends must make clear what error bars represent. We suggest eight simple rules to assist with effective use and interpretation of error bars.
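To see how much the choice of quantity matters, the following sketch computes standard deviation, standard error, and 95% confidence interval bars for the same simulated sample; the sample itself is hypothetical and the bar lengths differ substantially, which is why the figure legend must say which one is plotted.

```python
# A minimal sketch (not from the article): SD, SE, and 95% CI computed from
# the same hypothetical sample give very different error bar lengths.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=30)   # hypothetical measurements

n = sample.size
sd = sample.std(ddof=1)                 # spread of the data
se = sd / np.sqrt(n)                    # precision of the estimated mean
ci = stats.t.ppf(0.975, df=n - 1) * se  # half-width of the 95% CI

print(f"mean ± SD:     {sample.mean():.2f} ± {sd:.2f}")
print(f"mean ± SE:     {sample.mean():.2f} ± {se:.2f}")
print(f"mean ± 95% CI: {sample.mean():.2f} ± {ci:.2f}")
```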
Elicitation of expert opinion is important for risk analysis when only limited data are available. Expert opinion is often elicited in the form of subjective confidence intervals; however, these are prone to substantial overconfidence. We investigated the influence of elicitation question format, in particular the number of steps in the elicitation procedure. In a 3-point elicitation procedure, an expert is asked for a lower limit, upper limit, and best guess, the two limits creating an interval of some assigned confidence level (e.g., 80%). In our 4-step interval elicitation procedure, experts were also asked for a realistic lower limit, upper limit, and best guess, but no confidence level was assigned; the fourth step was to rate their anticipated confidence in the interval produced. In our three studies, experts made interval predictions of rates of infectious diseases (Study 1, n = 21 and Study 2, n = 24: epidemiologists and public health experts) or marine invertebrate populations (Study 3, n = 34: ecologists and biologists). We combined the results from our studies using meta-analysis, which found average overconfidence of 11.9%, 95% CI [3.5, 20.3] (a hit rate of 68.1% for 80% intervals), a substantial decrease in overconfidence compared with previous studies. Studies 2 and 3 suggest that the 4-step procedure is more likely to reduce overconfidence than the 3-point procedure (Cohen's d = 0.61, 95% CI [0.04, 1.18]).
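The overconfidence measure used here can be made concrete with a short sketch: score each elicited interval by whether it contains the realized value, then compare the hit rate with the assigned confidence level. The intervals and true values below are hypothetical, not data from the studies.

```python
# A minimal sketch of scoring overconfidence: hit rate is the fraction of
# elicited intervals containing the true value; overconfidence is the
# assigned confidence minus the hit rate. Data below are hypothetical.
import numpy as np

lowers = np.array([2.0, 10.0, 0.5, 30.0])   # elicited lower limits
uppers = np.array([8.0, 25.0, 4.0, 55.0])   # elicited upper limits
truths = np.array([9.1, 18.0, 3.2, 40.0])   # realized true values

hits = (lowers <= truths) & (truths <= uppers)
hit_rate = hits.mean()
assigned_confidence = 0.80                   # e.g., 80% intervals

print(f"hit rate = {hit_rate:.1%}")
print(f"overconfidence = {assigned_confidence - hit_rate:+.1%}")
# Here 3 of 4 intervals contain the truth: hit rate 75%, overconfidence +5%.
```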