Pedigree information is often missing for some animals in a breeding program. Unknown-parent groups (UPGs) are assigned to the missing parents to avoid biased genetic evaluations. Although the use of UPGs is well-established for the pedigree model, it is unclear how UPGs are integrated into the inverse of the unified relationship matrix (H-inverse) required for single-step genomic best linear unbiased prediction. A generalization of the UPG model is the metafounder (MF) model. The objectives of this study were to derive three H-inverses and to compare genetic trends among models with UPG and MF H-inverses using a simulated purebred population. All inverses were derived using the joint density function of the random breeding values and genetic groups. The breeding values of genotyped animals (u2) were assumed to be adjusted for UPG effects (g) using matrix Q2 as u2* = u2 + Q2g before incorporating genomic information. The Quaas-Pollak-transformed (QP) H-inverse was derived using a joint density function of u2* and g updated with genomic information and assuming nonzero cov(u2*, g'). The modified QP (Altered) H-inverse also assumes that the genomic information updates u2* and g, but with cov(u2*, g') = 0. The UPG-encapsulated (EUPG) H-inverse assumes that genomic information updates only the distribution of u2*. The EUPG H-inverse had the same structure as the MF H-inverse. Fifty percent of the genotyped females in the simulation had a missing dam, and missing parents were replaced with UPGs by generation. The simulation study indicated that u2* and g in models using the QP and Altered H-inverses may be inseparable, leading to potential biases in genetic trends. Models using the EUPG and MF H-inverses showed no genetic trend biases. These two H-inverses yielded the same genomic EBV (GEBV). The predictive ability and inflation of GEBVs from young genotyped animals were nearly identical among models using the QP, Altered, EUPG, and MF H-inverses. 
Although the choice of H-inverse in real applications with enough data may not result in biased genetic trends, the EUPG and MF H-inverses are to be preferred because of their theoretical justification and their potential to reduce biases.
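The pedigree step mentioned in this abstract, replacing missing parents with UPGs by generation, can be sketched as follows. This is a minimal illustration only: the record layout, the function name `assign_upgs`, the `UPG_G<generation>` labels, and the choice to group by the generation of the animal whose parent is missing are all assumptions, not details given in the abstract.

```python
def assign_upgs(pedigree):
    """Return a copy of the pedigree with missing parents replaced by UPG labels.

    pedigree: list of dicts with keys "animal", "sire", "dam", "generation";
    a missing parent is represented by None (hypothetical layout).
    """
    completed = []
    for record in pedigree:
        # Assumed grouping rule: one UPG per generation of the animal itself.
        label = f"UPG_G{record['generation']}"
        new_record = dict(record)
        for parent in ("sire", "dam"):
            if new_record[parent] is None:
                new_record[parent] = label
        completed.append(new_record)
    return completed


# Toy pedigree: a founder with both parents unknown and a daughter with a missing dam.
toy_pedigree = [
    {"animal": "A", "sire": None, "dam": None, "generation": 0},
    {"animal": "B", "sire": "A", "dam": None, "generation": 1},
]
toy_completed = assign_upgs(toy_pedigree)
```

In a real evaluation the grouping rule (by birth year, sex, or selection path) is a modeling decision; the sketch only shows the bookkeeping.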
The objective of this study was to determine whether the linear regression (LR) method could be used to validate genomic threshold models. Statistics for the LR method were computed from estimated breeding values (EBVs) using the whole and truncated data sets with variances from the reference and validation populations. The method was tested using simulated and real chicken data sets. The simulated data set included 10 generations of 4,500 birds each; genotypes were available for the last three generations. Each animal was assigned a continuous trait, which was converted to a binary score assuming an incidence of failure of 7%. The real data set included the survival status of 186,596 broilers (mortality rate equal to 7.2%) and genotypes of 18,047 birds. Both data sets were analysed using best linear unbiased predictor (BLUP) or single‐step GBLUP (ssGBLUP). The whole data set included all phenotypes available, whereas in the partial data set, phenotypes of the most recent generation were removed. In the simulated data set, the accuracies based on the LR formulas were 0.45 for BLUP and 0.76 for ssGBLUP, whereas the correlations between true breeding values and EBVs (i.e. true accuracies) were 0.37 and 0.65, respectively. The gain in accuracy by adding genomic information was overestimated by 0.09 when using the LR method compared to the true increase in accuracy. However, when the estimated ratio between the additive variance computed based on pedigree only and on pedigree and genomic information was considered, the difference between true and estimated gain was <0.02. Accuracies of BLUP and ssGBLUP with the real data set were 0.41 and 0.47, respectively. This small improvement in accuracy when using ssGBLUP with the real data set was due to population structure and lower heritability. 
The LR method is a useful tool for estimating improvements in accuracy of EBVs due to the inclusion of genomic information when traditional validation methods such as k‐fold validation and predictive ability are not applicable.
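The LR statistics compare EBVs of the validation animals obtained from the partial and the whole data sets. The sketch below uses the commonly cited forms of these statistics (bias, dispersion slope, ratio of accuracies, and accuracy); the abstract itself does not list the formulas, so the function name `lr_statistics`, its arguments, and the use of a single `genetic_var` value (the genetic variance of the validation animals, adjusted for inbreeding) are assumptions.

```python
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def _cov(x, y):
    """Sample covariance (n - 1 denominator)."""
    mx, my = _mean(x), _mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def lr_statistics(ebv_partial, ebv_whole, genetic_var):
    """LR validation statistics for the same animals in two evaluations.

    ebv_partial: EBVs from the partial data (recent phenotypes removed)
    ebv_whole:   EBVs from the whole data
    genetic_var: assumed genetic variance among the validation animals
    """
    cov_wp = _cov(ebv_whole, ebv_partial)
    var_p = _cov(ebv_partial, ebv_partial)
    var_w = _cov(ebv_whole, ebv_whole)
    return {
        # Expected value 0 when the evaluation is unbiased.
        "bias": _mean(ebv_partial) - _mean(ebv_whole),
        # Regression of whole on partial EBVs; expected value 1.
        "dispersion": cov_wp / var_p,
        # Correlation between whole and partial EBVs (ratio of accuracies).
        "accuracy_ratio": cov_wp / sqrt(var_w * var_p),
        # Accuracy of the partial EBVs; undefined if the covariance is negative.
        "accuracy": sqrt(cov_wp / genetic_var) if cov_wp >= 0 else float("nan"),
    }


# Toy EBVs for four validation animals (made-up numbers, illustration only).
toy_stats = lr_statistics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.5, 4.0], genetic_var=4.0)
```

For a threshold model, the EBVs would be on the underlying liability scale and `genetic_var` would have to match the population (reference or validation) chosen for the variances, as the abstract notes.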
The objectives of this study were to develop an efficient algorithm for calculating prediction error variances (PEV) for GBLUP models using the Algorithm for Proven and Young (APY), extend it to single-step GBLUP (ssGBLUP), and apply it to approximate the theoretical reliabilities for single- and multiple-trait models in ssGBLUP. The PEV with APY was calculated by block-sparse inversion, efficiently exploiting the sparse structure of the inverse of the genomic relationship matrix with APY. Single-step GBLUP reliabilities were approximated by combining reliabilities with and without genomic information in terms of effective record contributions. Multi-trait reliabilities relied on single-trait results adjusted using the genetic and residual covariance matrices among traits. Tests involved two datasets provided by the American Angus Association. A small dataset (Data1) was used for comparing the approximated reliabilities with the reliabilities obtained by the inversion of the left-hand side of the mixed model equations. The large dataset (Data2) was used for evaluating the computational performance of the algorithm. Analyses with both datasets used single-trait and three-trait models. The number of animals in the pedigree ranged from 167,951 in Data1 to 10,213,401 in Data2, with 50,000 and 20,000 genotyped animals for single-trait and multiple-trait analyses, respectively, in Data1 and 335,325 in Data2. Correlations between approximated and exact reliabilities obtained by inversion ranged from 0.97 to 0.99, whereas the intercept and slope of the regression of the exact on the approximated reliabilities ranged from 0.00 to 0.04 and from 0.93 to 1.05, respectively. For the three-trait model with the large dataset (Data2), the elapsed time for the reliability estimation was eleven minutes. The computational complexity of the proposed algorithm increased linearly with the number of genotyped animals and with the number of traits in the model. 
This algorithm can efficiently approximate the theoretical reliability of genomic estimated breeding values in ssGBLUP with APY for large numbers of genotyped animals at a low cost.
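The quantities named in this abstract are linked by standard single-trait conversions, sketched below. The abstract does not state its formulas, so these are the textbook definitions, not the paper's algorithm: reliability as 1 - PEV / (d * sigma_a^2), where d is the animal's diagonal relationship element, and effective record contributions (ERC) defined through lambda = (1 - h2) / h2. All function and argument names are hypothetical.

```python
def reliability_from_pev(pev, additive_var, diag_relationship=1.0):
    """Reliability = 1 - PEV / (d * sigma_a^2), with d the animal's diagonal
    element of the relationship matrix (1.0 for a non-inbred animal)."""
    return 1.0 - pev / (diag_relationship * additive_var)


def erc_from_reliability(reliability, h2):
    """Effective record contributions implied by a reliability (0 <= rel < 1)."""
    lam = (1.0 - h2) / h2  # ratio of residual to additive variance
    return lam * reliability / (1.0 - reliability)


def reliability_from_erc(erc, h2):
    """Inverse conversion: reliability implied by an ERC value."""
    lam = (1.0 - h2) / h2
    return erc / (erc + lam)


# Round trip with made-up values: h2 = 0.25 gives lambda = 3.
toy_erc = erc_from_reliability(0.5, h2=0.25)
toy_rel = reliability_from_erc(toy_erc, h2=0.25)
```

Working on the ERC scale is convenient because contributions from different information sources can be accumulated before converting back to a reliability; how the paper combines the genomic and non-genomic contributions is not specified in the abstract and is not reproduced here.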
Background: Single-step genomic predictions obtained from a breeding value model require calculating the inverse of the genomic relationship matrix $$({\mathbf{G}}^{-1})$$. The Algorithm for Proven and Young (APY) creates a sparse representation of $${\mathbf{G}}^{-1}$$ with a low computational cost. APY consists of selecting a group of core animals and expressing the breeding values of the remaining animals as a linear combination of those from the core animals plus an error term. The objectives of this study were to: (1) extend APY to marker effects models; (2) derive equations for marker effect estimates when APY is used for breeding value models; and (3) show the implication of selecting a specific group of core animals in terms of a marker effects model. Results: We derived a family of marker effects models called APY-SNP-BLUP. It differs from the classic marker effects model in that the row space of the genotype matrix is reduced and an error term is fitted for non-core animals. We derived formulas for marker effect estimates that take this error term into account. The prediction error variance (PEV) of the marker effect estimates depends on the PEV for core animals but not directly on the PEV of the non-core animals. We extended the APY-SNP-BLUP to include a residual polygenic effect and accommodate non-genotyped animals. We show that selecting a specific group of core animals is equivalent to selecting a subspace of the row space of the genotype matrix. As the number of core animals increases, subspaces corresponding to different sets of core animals tend to overlap, showing that random selection of core animals is algebraically justified. Conclusions: The APY-(ss)GBLUP models can be expressed in terms of marker effect models. When the number of core animals is equal to the rank of the genotype matrix, APY-SNP-BLUP is identical to the classic marker effects model. 
If the number of core animals is less than the rank of the genotype matrix, genotypes for non-core animals are imputed as a linear combination of the genotypes of the core animals. For estimating SNP effects, only relationships and estimated breeding values for core animals are needed.
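As a sketch in the usual APY notation (the subscripts $$c$$ and $$n$$ for core and non-core animals, and the symbols $$\mathbf{P}_{nc}$$, $$\mathbf{M}_{nn}$$, and $$\boldsymbol{\varepsilon}$$, are assumed here and do not appear in the abstract), the breeding values of non-core animals are written as a linear combination of the core breeding values plus an error term:

$$\mathbf{u}_n = \mathbf{P}_{nc}\mathbf{u}_c + \boldsymbol{\varepsilon}, \qquad \mathbf{P}_{nc} = \mathbf{G}_{nc}\mathbf{G}_{cc}^{-1}$$

Assuming the error terms are independent, this gives the sparse inverse:

$$\mathbf{G}_{\mathrm{APY}}^{-1} = \begin{bmatrix}\mathbf{G}_{cc}^{-1} & \mathbf{0}\\ \mathbf{0} & \mathbf{0}\end{bmatrix} + \begin{bmatrix}-\mathbf{P}_{nc}'\\ \mathbf{I}\end{bmatrix}\mathbf{M}_{nn}^{-1}\begin{bmatrix}-\mathbf{P}_{nc} & \mathbf{I}\end{bmatrix}, \qquad m_{ii} = g_{ii} - \mathbf{g}_{ic}\mathbf{G}_{cc}^{-1}\mathbf{g}_{ci}$$

Only $$\mathbf{G}_{cc}$$ needs a dense inversion and $$\mathbf{M}_{nn}$$ is diagonal, which is where the low computational cost comes from.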