Genetic variance of a phenotypic trait can originate from direct genetic effects, or from indirect effects, i.e. through genetic effects on other traits that in turn affect the trait of interest. This distinction is often of great importance, for example when trying to improve crop yield while simultaneously controlling plant height. As suggested by Sewall Wright, assessing the contributions of direct and indirect effects requires knowledge of (1) the presence or absence of direct genetic effects on each trait, and (2) the functional relationships between the traits. Because experimental validation of such relationships is often unfeasible, it is increasingly common to reconstruct them using causal inference methods. However, current methods require all genetic variance to be explained by a small number of QTLs with fixed effects. Only a few authors have considered the 'missing heritability' case, where contributions of many undetectable QTLs are modelled with random effects. Usually, these are treated as nuisance terms that need to be eliminated by taking residuals from a multi-trait mixed model (MTM). But fitting such an MTM is challenging, and with this approach it is impossible to infer the presence of direct genetic effects. Here we propose an alternative strategy, where genetic effects are formally included in the graph. This has important advantages: (1) genetic effects can be directly incorporated in causal inference, leading to the PCgen algorithm, which can analyze many more traits; and (2) we can test the existence of direct genetic effects, and improve the orientation of edges between traits. Finally, we show that reconstruction is much more accurate if individual plant or plot data are used. We have implemented the PCgen algorithm in the R package pcgen.
KEYWORDS Structural Equation Models, multivariate mixed models, causal inference
the genomic prediction applications are based on linear mixed or Bayesian models that predict the phenotype for the target trait (yield) as a function of a multivariate distribution for SNP effects. In these models, the physiological mechanisms and traits that modulate the genotypic response to the environment over time are modeled implicitly via the SNP effects on the target trait (Zhou and Stephens 2014; Calus and Veerkamp 2011). The availability of high throughput phenotyping technologies has enabled breeders to characterize additional traits and monitor growth and development during the season. This opens new opportunities in breeding strategies, in which better-adapted genotypes result from combining loci that regulate complementary physiological mechanisms. This kind of breeding strategy is called physiological breeding (Reynolds and Langridge 2016).
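To make concrete what "modeled implicitly via the SNP effects" can look like, the following is a minimal sketch of a single-trait ridge-regression predictor of the RR-BLUP type, in which all SNP effects share one normal prior. It is an illustration only and not the method of any work cited above: the function names `fit_rr_blup` and `predict_rr_blup`, the simulated marker data, and the assumption that the variance ratio `lam` is known (in practice it would be estimated, e.g. by REML) are all hypothetical choices made here.

```python
import numpy as np


def fit_rr_blup(Z, y, lam):
    """Ridge-regression estimate of SNP effects for one target trait.

    Z   : (n, p) marker matrix (e.g. allele counts 0/1/2)
    y   : (n,) phenotypes of the target trait
    lam : assumed-known ratio sigma_e^2 / sigma_u^2 (illustrative assumption)
    """
    z_mean = Z.mean(axis=0)
    Zc = Z - z_mean              # centered markers, so the intercept is mean(y)
    mu = y.mean()
    n = Zc.shape[0]
    # Dual form, cheap when p >> n: u_hat = Zc' (Zc Zc' + lam I)^{-1} (y - mu)
    u_hat = Zc.T @ np.linalg.solve(Zc @ Zc.T + lam * np.eye(n), y - mu)
    return mu, u_hat, z_mean


def predict_rr_blup(Z_new, mu, u_hat, z_mean):
    """Predicted phenotype for new genotypes from the estimated SNP effects."""
    return mu + (Z_new - z_mean) @ u_hat


# Simulated example: many small SNP effects, no explicit physiological traits.
rng = np.random.default_rng(0)
n, p = 300, 1000
Z = rng.binomial(2, 0.3, size=(n, p)).astype(float)
u_true = rng.normal(0.0, 0.05, size=p)
y = 5.0 + Z @ u_true + rng.normal(0.0, 1.0, size=n)

train, test = np.arange(200), np.arange(200, n)
mu, u_hat, z_mean = fit_rr_blup(Z[train], y[train], lam=1.0 / 0.05**2)
y_pred = predict_rr_blup(Z[test], mu, u_hat, z_mean)
print("prediction accuracy (correlation):", np.corrcoef(y_pred, y[test])[0, 1])
```

In this sketch the only route from genotype to yield is the vector of SNP effects; any intermediate trait that mediates the response is absorbed into those effects rather than represented explicitly.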
In physiological breeding, prediction accuracy for the target trait benefits from modeling the target trait and its underlying traits simultaneously (van Eeuwijk et al. 2018) because of a larger power to estimate its underlying effects (Stephens 2013).
The challenge of...