A common practice in IV studies is to check for instrument strength, i.e., its association with the treatment, using an F-test from a regression. If the F-statistic is above some threshold, usually 10, the instrument is deemed to satisfy one of the three core IV assumptions and is then used to test for the treatment effect. However, in many cases, inference on the treatment effect does not account for this prior strength test. In this paper, we show that ignoring this pretest can severely distort the distribution of the test statistic, and we propose a method that corrects the distortion and produces valid inference. A key insight in our method is to frame the F-test as a randomized convex optimization problem and to leverage recent methods in selective inference. We prove that our method provides conditional and marginal Type I error control. We also extend our method to weak instrument settings. We conclude with a reanalysis of studies concerning the effect of education on earnings, where we show that accounting for pretesting can dramatically alter the original conclusions about education's effects.
INTRODUCTION
Motivation
Instrumental variables (IV) is a commonly used approach in economics, epidemiology, genetics, and health policy to estimate the effect of an exposure, treatment, or policy on an outcome of interest; see Angrist and Krueger (2001), Robins (2006), and Baiocchi et al. (2014) for overviews. IV methods require finding variables, known as