We theoretically analyze the problem of testing for p‐hacking based on distributions of p‐values across multiple studies. We provide general results for when such distributions have testable restrictions (are non‐increasing) under the null of no p‐hacking. We find novel additional testable restrictions for p‐values based on t‐tests. Specifically, the shape of the power functions results in both complete monotonicity as well as bounds on the distribution of p‐values. These testable restrictions result in more powerful tests for the null hypothesis of no p‐hacking. When there is also publication bias, our tests are joint tests for p‐hacking and publication bias. A reanalysis of two prominent data sets shows the usefulness of our new tests.
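The non-increasing restriction mentioned above can be illustrated with a simple binomial comparison of two adjacent, equal-width bins just below 0.05. This is a minimal sketch of one commonly used style of check, not the tests proposed in the abstract; the function names, bin edges, and the choice of an exact one-sided binomial tail are all assumptions made for illustration. If the p-value density is non-increasing, a p-value that falls in either bin lands in the upper bin with probability at most 1/2, so an unusually large count in the upper bin is evidence against the null.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binomial_phacking_test(p_values, lo=0.04, mid=0.045, hi=0.05):
    """One-sided check for bunching of p-values in (mid, hi] relative to (lo, mid].

    Hypothetical helper: the bin edges are illustrative defaults.
    Returns the p-value of the check, or None if both bins are empty.
    """
    lower = sum(1 for p in p_values if lo < p <= mid)
    upper = sum(1 for p in p_values if mid < p <= hi)
    n = lower + upper
    if n == 0:
        return None
    # p = 0.5 is the least favourable case under a non-increasing density.
    return binomial_upper_tail(upper, n)


if __name__ == "__main__":
    # Made-up p-values purely to show the call; not data from any study.
    sample = [0.041, 0.042, 0.046, 0.047, 0.048, 0.049]
    print(binomial_phacking_test(sample))
```

A small returned value suggests more mass just below 0.05 than a non-increasing density would allow; a large value is uninformative, since the check looks only at a narrow window of the distribution.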
We analyze what can be learned from tests for p-hacking based on distributions of t-statistics and p-values across multiple studies. We analytically characterize restrictions on these distributions that are consistent with the absence of p-hacking. This forms a testable null hypothesis and suggests statistical tests for p-hacking. We extend our results to p-hacking when there is also publication bias, and also consider what types of distributions arise under the alternative hypothesis that researchers engage in p-hacking. We show that the power of statistical tests for detecting p-hacking is low even if p-hacking is quite prevalent.
p-Hacking can undermine the validity of empirical studies. A flourishing empirical literature investigates the prevalence of p-hacking based on the empirical distribution of reported p-values across studies. Interpreting results in this literature requires a careful understanding of the power of methods used to detect different types of p-hacking. We theoretically study the implications of likely forms of p-hacking on the distribution of reported p-values and the power of existing methods for detecting it. Power can be quite low, depending crucially on the particular p-hacking strategy and the distribution of actual effects tested by the studies. We relate the power of the tests to the costs of p-hacking and show that power tends to be larger when p-hacking is very costly. Monte Carlo simulations support our theoretical results.
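The dependence of power on the p-hacking strategy and on the distribution of true effects can be explored with a small Monte Carlo sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the p-hacking strategy (a researcher with an insignificant result runs a few extra independent tests and reports the smallest p-value), the single common effect size, the one-sided normal test, and the binomial bunching check used as the detector.

```python
import random
from math import comb, erfc, sqrt


def one_sided_p(z):
    """Upper-tail p-value of a standard normal test statistic."""
    return 0.5 * erfc(z / sqrt(2))


def reported_p(rng, effect, hack_prob, extra_tries):
    """Hypothetical strategy: if the first result is insignificant, a p-hacker
    runs `extra_tries` more independent tests and reports the smallest p-value."""
    p = one_sided_p(rng.gauss(effect, 1.0))
    if p > 0.05 and rng.random() < hack_prob:
        for _ in range(extra_tries):
            p = min(p, one_sided_p(rng.gauss(effect, 1.0)))
    return p


def upper_tail(k, n):
    """Exact P(X >= k) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


def rejects(p_values, alpha=0.05):
    """Binomial bunching check on (0.04, 0.045] versus (0.045, 0.05]."""
    lower = sum(1 for p in p_values if 0.04 < p <= 0.045)
    upper = sum(1 for p in p_values if 0.045 < p <= 0.05)
    n = lower + upper
    return n > 0 and upper_tail(upper, n) <= alpha


def simulated_power(n_sims=200, n_studies=500, effect=1.0,
                    hack_prob=0.5, extra_tries=3, seed=0):
    """Share of simulated literatures in which the check rejects."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        sample = [reported_p(rng, effect, hack_prob, extra_tries)
                  for _ in range(n_studies)]
        hits += rejects(sample)
    return hits / n_sims


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        print(share, simulated_power(hack_prob=share))
```

Varying `effect`, `hack_prob`, and `extra_tries`, or swapping in a different strategy inside `reported_p`, shows how sensitive the rejection rate is to these choices.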