Polygenic scores have become the workhorse for empirical analyses in social-science genetics. Because a polygenic score is constructed using the results of finite-sample Genome-Wide Association Studies (GWASs), it is a noisy approximation of the true latent genetic predisposition to a certain trait. The conventional way of boosting the predictive power of polygenic scores is to increase the GWAS sample size by meta-analyzing GWAS results of multiple cohorts. In this paper we challenge this convention. Through simulations, we show that Instrumental Variable (IV) regression using two polygenic scores from independent GWAS samples outperforms the typical Ordinary Least Squares (OLS) model employing a single meta-analysis-based polygenic score in terms of bias, root mean squared error, and statistical power. We verify the empirical validity of these simulations by predicting educational attainment (EA) and height in a sample of siblings from the UK Biobank. We show that between-family IV regression yields estimates that approach the SNP-based heritabilities, while within-family IV regression provides a tighter lower bound on the direct genetic effect than a meta-analysis-based polygenic score does. IV estimation improves the predictive power of polygenic scores by 12% (height) to 22% (EA). Our findings suggest that measurement error is a key explanation for hidden heritability (i.e., the difference between SNP-based and GWAS-based heritability), and that it can be overcome using IV regression. We derive the practical rule of thumb that IV outperforms OLS when the correlation between the two polygenic scores used in IV regression is larger than √(10 / (N+10)), with N the sample size of the prediction sample.
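The attenuation argument and the rule of thumb in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code: the function names (`iv_threshold`, `simulate`), the noise level, and the data-generating process are assumptions chosen for illustration. It also compares IV against OLS on a single noisy score rather than a meta-analysis-based score, and it leaves the scores unstandardized.

```python
import math
import random


def iv_threshold(n):
    """Rule-of-thumb correlation sqrt(10 / (N + 10)) quoted in the abstract."""
    return math.sqrt(10 / (n + 10))


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def simulate(n=20000, beta=0.5, noise_sd=1.0, seed=0):
    """Return (ols, iv) slope estimates from one simulated prediction sample.

    Assumed toy model: a latent score g, two polygenic scores that equal g
    plus independent measurement error, and an outcome y = beta * g + e.
    """
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]
    pgs1 = [gi + rng.gauss(0, noise_sd) for gi in g]
    pgs2 = [gi + rng.gauss(0, noise_sd) for gi in g]
    y = [beta * gi + rng.gauss(0, 1) for gi in g]

    # OLS of y on the noisy score: attenuated by the measurement error.
    ols = cov(pgs1, y) / cov(pgs1, pgs1)
    # IV: the second score instruments the first; the independent errors
    # drop out of both covariances.
    iv = cov(pgs2, y) / cov(pgs2, pgs1)
    return ols, iv
```

Under these assumptions the OLS slope converges to beta / (1 + noise_sd²), which is 0.25 for the defaults, while the IV slope targets beta itself. `iv_threshold(n)` simply evaluates the stated rule of thumb for a prediction sample of size `n`.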
It is well-established that birth weight is affected by the child's genetic endowments and maternal smoking during pregnancy. Here, we investigate whether an interaction between genetic endowments and maternal smoking on birth weight exists. We instrument the maternal smoking decision with a genetic variant (rs1051730) located in the nicotine receptor gene CHRNA3 and deal with the underreporting of maternal smoking by using a biomarker of nicotine collected during pregnancy. We confirm that genetic endowments and maternal smoking negatively affect the child's birth weight. However, we do not find evidence of meaningful interactions between genetic endowments and maternal smoking.
JEL Codes: I100 Health: General; I140 Health and Inequality.
The authors gratefully acknowledge funding from NORFACE under the Dynamics of Inequality (DIAL) programme (462-16-100).
Measurement error in polygenic indices (PGIs) attenuates the estimation of their effects in regression models. We analyze and compare two approaches addressing this attenuation bias: Obviously Related Instrumental Variables (ORIV) and the PGI Repository Correction (PGI-RC). Through simulations, we show that the PGI-RC performs slightly better than ORIV, unless the prediction sample is very small (N < 1000) or there is considerable assortative mating. Within families, ORIV is the best choice since the PGI-RC correction factor is generally not available. We verify the empirical validity of the simulations by predicting educational attainment and height in a sample of siblings from the UK Biobank. We show that applying ORIV between families increases the standardized effect of the PGI by 12% (height) and by 22% (educational attainment) compared to a meta-analysis-based PGI, yet estimates remain slightly below the PGI-RC estimates. Furthermore, within-family ORIV regression provides the tightest lower bound for the direct genetic effect, increasing the lower bound for the standardized direct genetic effect on educational attainment from 0.14 to 0.18 (+29%), and for height from 0.54 to 0.61 (+13%) compared to a meta-analysis-based PGI.
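As a rough illustration of how ORIV uses both PGIs symmetrically, the sketch below computes a stacked two-stage least squares point estimate in which each PGI instruments the other and each stacked copy has its own intercept. It is a simplified sketch rather than the authors' implementation: the names `oriv` and `simulate_pgis` and the toy data-generating process are assumptions, the PGIs are left unstandardized, and no standard errors are computed (in a stacked regression they would need clustering by individual).

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def oriv(y, pgi_a, pgi_b):
    """Point estimate from stacking (y, pgi_a | pgi_b) and (y, pgi_b | pgi_a)."""
    c_ab = cov(pgi_a, pgi_b)
    # First stages, one per stacked copy: each PGI regressed on the other.
    pi_a = c_ab / cov(pgi_b, pgi_b)
    pi_b = c_ab / cov(pgi_a, pgi_a)
    # Second stage: pooled 2SLS slope using the fitted values as instruments.
    numerator = pi_a * cov(pgi_b, y) + pi_b * cov(pgi_a, y)
    denominator = (pi_a + pi_b) * c_ab
    return numerator / denominator


def simulate_pgis(n=20000, beta=0.5, noise_sd=1.0, seed=1):
    """Hypothetical toy data: latent g, two noisy PGIs, y = beta * g + e."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]
    pgi_a = [gi + rng.gauss(0, noise_sd) for gi in g]
    pgi_b = [gi + rng.gauss(0, noise_sd) for gi in g]
    y = [beta * gi + rng.gauss(0, 1) for gi in g]
    return y, pgi_a, pgi_b
```

The estimate does not depend on which PGI is labelled `pgi_a`, which is the practical appeal of stacking over choosing one score as the instrument.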