Background
Heritability and genetic correlation can be estimated from genome-wide single-nucleotide polymorphism (SNP) data using various methods. We recently developed multivariate genomic-relatedness-based restricted maximum likelihood (MGREML) for statistically and computationally efficient estimation of SNP-based heritability ($$h^2_{\text{SNP}}$$
h
SNP
2
) and genetic correlation ($$\rho _G$$
ρ
G
) across many traits in large datasets. Here, we extend MGREML by allowing it to fit and perform tests on user-specified factor models, while preserving the low computational complexity.
Results
Using simulations, we show that MGREML yields consistent estimates and valid inferences for such factor models at low computational cost (e.g., for data on 50 traits and 20,000 individuals, a saturated model involving 50 $$h^2_{\text{SNP}}$$
h
SNP
2
’s, 1225 $$\rho _G$$
ρ
G
’s, and 50 fixed effects is estimated and compared to a restricted model in less than one hour on a single notebook with two 2.7 GHz cores and 16 GB of RAM). Using repeated measures of height and body mass index from the US Health and Retirement Study, we illustrate the ability of MGREML to estimate a factor model and test whether it fits the data better than a nested model. The MGREML tool, the simulation code, and an extensive tutorial are freely available at https://github.com/devlaming/mgreml/.
Conclusion
MGREML can now be used to estimate multivariate factor structures and perform inferences on such factor models at low computational cost. This new feature enables simple structural equation modeling using MGREML, allowing researchers to specify, estimate, and compare genetic factor models of their choosing using SNP data.