2007
DOI: 10.1111/j.1420-9101.2006.01291.x
An unexpected influence of widely used significance thresholds on the distribution of reported P‐values

Abstract: We consider the problematic relationship between publication success and statistical significance in the light of analyses in which we examine the distribution of published probability (P) values across the statistical ‘significance’ range, below the 5% probability threshold. P‐values are often judged according to whether they lie beneath traditionally accepted thresholds (< 0.05, < 0.01, < 0.001, < 0.0001); we examine how these thresholds influence the distribution of reported absolute P‐values in published s…

Cited by 44 publications (57 citation statements)
References 27 publications
“…When p-hacking operates and no true non-zero effects exist, by contrast, the p-curve is (under the assumption that researchers have modest ambitions and report the first p-value < 0.05 they find) expected to be modestly left-skewed, with values close to 0.05 over-represented. This is because researchers choose to report analyses, out of multiple ones, that yield significant values (often barely so) rather than analyses that yield non-significant values (even if barely non-significant; see simulations in [20]). Preceding the development of p-curve, Ridley et al. [22] examined the distribution of p-values from biological science papers published in top journals. They found an overrepresentation of p-values just at or below thresholds conventionally used to determine statistical significance.…”

Section: Revisiting Van Dongen and Gangestad (2011) With New Meta-analyses
confidence: 99%
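The left-skew near 0.05 described above can be reproduced with a small simulation. The sketch below models one common p-hacking mechanism, optional stopping: a researcher tests pure-noise data, adds observations one at a time, and reports the first p-value below 0.05. Because testing stops at the first crossing of the threshold, significant p-values pile up just under 0.05 rather than spreading uniformly. The test used here (a two-sided z-test with known variance), the sample-size bounds, and the histogram bins are illustrative choices, not those of the cited simulations in [20].

```python
import math
import random

def z_test_p(xs):
    """Two-sided z-test of H0: mean = 0, with known sd = 1."""
    n = len(xs)
    z = (sum(xs) / n) * math.sqrt(n)
    # Normal CDF via math.erf; p = 2 * (1 - Phi(|z|))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def p_hacked_p(rng, n_start=10, n_max=50):
    """Optional stopping on null data: add one observation at a time
    and report the first p < 0.05, else the final p."""
    xs = [rng.gauss(0, 1) for _ in range(n_start)]
    p = z_test_p(xs)
    while p >= 0.05 and len(xs) < n_max:
        xs.append(rng.gauss(0, 1))
        p = z_test_p(xs)
    return p

rng = random.Random(1)
sig = [p for p in (p_hacked_p(rng) for _ in range(4000)) if p < 0.05]

# Compare the bin just under the threshold with the far-left bin:
near_thresh = sum(1 for p in sig if 0.04 < p < 0.05)
small = sum(1 for p in sig if p < 0.01)
print(near_thresh, small)
```

With no true effect, genuine evidence would put most significant p-values far below 0.05; under optional stopping the count in (0.04, 0.05) instead exceeds the count below 0.01, which is the over-representation near the threshold that both the p-curve literature and Ridley et al. describe.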
“…The selection issue has also received a great deal of attention in the medical literature (Berlin et al. 1989, Ioannidis 2005, Ridley et al. 2007) and in psychological science (Bastardi et al. 2011, Fanelli 2010a, Simmons et al. 2011). In addition, Auspurg and Hinz (2011), Gerber and Malhotra (2008b), Gerber and Malhotra (2008a), Gerber et al. (2010) and Masicampo and Lalande (2012) collect distributions of tests in journals in sociology, political science and psychology.…”

Section: Introduction
confidence: 99%
“…The tacit assumption is that marginally significant p-values are often the result of selective analysis and reporting (also called "p-hacking" or "fiddling"), whereas marginally non-significant results reflect decent unbiased science done by researchers who resist data massaging (Gadbury & Allison, 2012; Gelman & Loken, 2013; Gerber & Malhotra, 2008; Masicampo & Lalande, 2012; Ridley et al., 2007). We also searched for qualitative statements of statistical significance (i.e., "significant difference" vs. "no significant difference") to replicate Pautasso's (2010) method.…”

Section: Introduction
confidence: 99%