Conventional prediction of dairy cattle merit involves setting up and solving linear equations with the number of unknowns being the number of animals, typically millions, multiplied by the number of traits being simultaneously assessed. The coefficient matrix is large and sparse, and iteration on data has been the method of choice, whereby the coefficient matrix is not stored but recreated as needed. In contrast, genomic prediction involves assessment of the merit of genome fragments characterized by single nucleotide polymorphism genotypes, currently some 50,000, which can then be used to predict the merit of individual animals according to the fragments they have inherited. The prediction equations for chromosome fragments typically have fewer than 100,000 unknowns, but the number of observations used to predict the fragment effects can be one-tenth the number of fragments. The coefficient matrix tends to be dense and the resulting system of equations can be ill-behaved. Equivalent computing algorithms for genomic prediction were derived. The number of unknowns in the equivalent system grows with the number of genotyped animals, usually bulls, rather than the number of chromosome fragment effects. In circumstances with fewer genotyped animals than single nucleotide polymorphism genotypes, these equivalent computations allow the solving of a smaller system of equations that behaves numerically better. Three solving strategies were compared: 1 method that formed and stored the coefficient matrix in memory and 2 methods that iterate on data. Finally, formulas for reliabilities of genomic predictions of merit were developed.
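The equivalence described in this abstract can be illustrated with a minimal sketch. It assumes a ridge-type marker model without fixed effects, simulated genotypes and responses, and an arbitrary variance ratio `lam`; none of these values come from the paper. The marker system has one unknown per marker, whereas the equivalent system has one unknown per genotyped animal, and marker effects are recovered by back-solving.

```python
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_markers = 20, 200            # fewer genotyped animals than markers
Z = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)
Z -= Z.mean(axis=0)                        # centered genotype codes (assumption)
y = rng.normal(size=n_animals)             # simulated responses, not real data
lam = 50.0                                 # assumed residual-to-marker variance ratio

# Marker-effect system: n_markers unknowns, dense coefficient matrix.
g_marker = np.linalg.solve(Z.T @ Z + lam * np.eye(n_markers), Z.T @ y)

# Equivalent system: n_animals unknowns, then back-solve for marker effects.
alpha = np.linalg.solve(Z @ Z.T + lam * np.eye(n_animals), y)
g_equiv = Z.T @ alpha

# Genomic predictions for the animals from either route.
dgv_marker = Z @ g_marker
dgv_equiv = Z @ g_equiv
print(np.allclose(g_marker, g_equiv))
```

In this toy setting the animal-sized system is 20 by 20 instead of 200 by 200, which is the kind of saving the abstract refers to when genotyped animals are fewer than markers.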
This study investigated the possibility of increasing the reliability of direct genomic values (DGV) by combining reference populations. The data were from 3,735 bulls from Danish, Swedish, and Finnish Red dairy cattle populations. Single nucleotide polymorphism markers were fitted as random variables in a Bayesian model, using published estimated breeding values as response variables. In total, 17 index traits were analyzed. Reliabilities were estimated using a 5-fold cross validation, and calculated as the within-year squared correlation between estimated breeding values and DGV. Marker effects were estimated using reference populations from individual countries, as well as using a combined reference population from all 3 countries. Single-country reference populations gave mean reliabilities across 17 traits of 0.19 to 0.23, whereas the combined reference gave mean reliabilities of 0.26 for all populations. Using marker effects from 1 population to predict the other 2 gave a loss in mean reliability of 0.14 to 0.21 when predicting Swedish or Finnish animals with Danish marker effects, or vice versa. Using Swedish or Finnish marker effects to predict each other only showed a loss in mean reliability of 0.03 to 0.05. A combined Swedish-Finnish reference population led to an average reliability as high as that from the 3-country reference population, but somewhat different for individual traits. The results from this study show that it is possible to increase the reliability of DGV by combining reference populations from related populations.
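The validation scheme in this abstract (5-fold cross validation, reliability as the squared correlation between estimated breeding values and DGV) can be sketched as follows. This is only an illustration: ridge regression stands in for the study's Bayesian model, the correlation is pooled over all animals rather than computed within birth year, and the genotypes, responses, and `lam` are simulated or assumed.

```python
import numpy as np

def cross_validated_reliability(Z, ebv, lam, n_folds=5, seed=0):
    """Squared correlation between EBV and out-of-fold DGV (ridge stand-in)."""
    rng = np.random.default_rng(seed)
    n, m = Z.shape
    folds = np.array_split(rng.permutation(n), n_folds)
    dgv = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        col_means = Z[train_idx].mean(axis=0)
        Zt = Z[train_idx] - col_means          # center on the training animals
        mu = ebv[train_idx].mean()
        g = np.linalg.solve(Zt.T @ Zt + lam * np.eye(m),
                            Zt.T @ (ebv[train_idx] - mu))
        # Predict the held-out animals from marker effects estimated without them.
        dgv[test_idx] = mu + (Z[test_idx] - col_means) @ g
    r = np.corrcoef(ebv, dgv)[0, 1]
    return r ** 2, dgv

# Simulated example data (hypothetical, not the Nordic Red data).
rng = np.random.default_rng(42)
n_bulls, n_markers = 100, 50
Z = rng.integers(0, 3, size=(n_bulls, n_markers)).astype(float)
true_effects = rng.normal(size=n_markers)
ebv = Z @ true_effects + rng.normal(size=n_bulls)

reliability, dgv = cross_validated_reliability(Z, ebv, lam=1.0)
print(round(reliability, 3))
```

Predicting one population from another, as in the abstract, would amount to choosing the training and validation index sets by country instead of by random fold.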
Background: Genomic data are used in animal breeding to assist genetic evaluation. Several models to estimate genomic breeding values have been studied. In general, two approaches have been used. One approach estimates the marker effects first and then, genomic breeding values are obtained by summing marker effects. In the second approach, genomic breeding values are estimated directly using an equivalent model with a genomic relationship matrix. Allele coding is the method chosen to assign values to the regression coefficients in the statistical model. A common allele coding is zero for the homozygous genotype of the first allele, one for the heterozygote, and two for the homozygous genotype for the other allele. Another common allele coding changes these regression coefficients by subtracting a value from each marker such that the mean of regression coefficients is zero within each marker. We call this centered allele coding. This study considered effects of different allele coding methods on inference. Both marker-based and equivalent models were considered, and restricted maximum likelihood and Bayesian methods were used in inference. Results: Theoretical derivations showed that parameter estimates and estimated marker effects in marker-based models are the same irrespective of the allele coding, provided that the model has a fixed general mean. For the equivalent models, the same results hold, even though different allele coding methods lead to different genomic relationship matrices. Calculated genomic breeding values are independent of allele coding when the estimate of the general mean is included in the values. Reliabilities of estimated genomic breeding values calculated using elements of the inverse of the coefficient matrix depend on the allele coding because different allele coding methods imply different models. Finally, allele coding affects the mixing of Markov chain Monte Carlo algorithms, with the centered coding being the best. Conclusions: Different allele coding methods lead to the same inference in the marker-based and equivalent models when a fixed general mean is included in the model. However, reliabilities of genomic breeding values are affected by the allele coding method used. The centered coding has some numerical advantages when Markov chain Monte Carlo methods are used.
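The invariance result stated above can be checked numerically with a small sketch. It assumes a marker-based model with a fixed general mean and random marker effects solved through mixed model equations, with simulated genotypes, simulated responses, and an assumed variance ratio `lam`; the helper `snp_blup` is a hypothetical name introduced here for illustration.

```python
import numpy as np

def snp_blup(Z, y, lam):
    """Mixed model equations with a fixed general mean and random marker effects."""
    n, m = Z.shape
    X = np.ones((n, 1))
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * np.eye(m)]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], sol[1:]                 # (general mean, marker effects)

rng = np.random.default_rng(7)
n_animals, n_markers = 30, 100
Z012 = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)  # 0/1/2 coding
col_means = Z012.mean(axis=0)
Zcen = Z012 - col_means                    # centered coding
y = rng.normal(size=n_animals)             # simulated responses
lam = 10.0                                 # assumed variance ratio

mu_012, g_012 = snp_blup(Z012, y, lam)
mu_cen, g_cen = snp_blup(Zcen, y, lam)

# Marker effects agree; the general mean absorbs the shift in coding.
gebv_012 = mu_012 + Z012 @ g_012
gebv_cen = mu_cen + Zcen @ g_cen
print(np.allclose(g_012, g_cen), np.allclose(gebv_012, gebv_cen))
```

The two estimated means differ by the column means times the marker effects, which is why breeding values agree only when the general mean is included in them.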
Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were reordered to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20% and 435% more time to solve the univariate and multivariate animal models, respectively. Computations with the second-best iteration-on-data program took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
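A minimal sketch of preconditioned conjugate gradient with iteration on data follows. It is one plausible reading of the three-step product (gather, scale, scatter per record), not the authors' program. The model is deliberately simplified: one fixed effect and one animal effect per record, unrelated animals (identity instead of a pedigree-based relationship matrix), unit residual weights, a diagonal preconditioner, and simulated data. All names are hypothetical.

```python
import numpy as np

def multiply_by_data(records, v, n_fixed, lam, r_inv=1.0):
    """Compute C @ v record by record without forming the coefficient matrix C."""
    out = np.zeros_like(v)
    for fixed_level, animal in records:
        i, j = fixed_level, n_fixed + animal
        s = v[i] + v[j]          # step 1: gather, s = w' v for this record
        s *= r_inv               # step 2: scale by the residual weight
        out[i] += s              # step 3: scatter s back to the equations
        out[j] += s
    out[n_fixed:] += lam * v[n_fixed:]   # contribution of the random-effect prior
    return out

def pcg(records, y, n_fixed, n_animals, lam, tol=1e-10, max_iter=500):
    n = n_fixed + n_animals
    rhs, diag = np.zeros(n), np.zeros(n)
    for (f, a), obs in zip(records, y):  # one pass over data: rhs and diagonal
        rhs[f] += obs
        rhs[n_fixed + a] += obs
        diag[f] += 1.0
        diag[n_fixed + a] += 1.0
    diag[n_fixed:] += lam
    x = np.zeros(n)
    r = rhs.copy()                       # residual for the zero starting vector
    z = r / diag                         # diagonal (Jacobi) preconditioner
    p = z.copy()
    rz = r @ z
    for it in range(max_iter):
        Cp = multiply_by_data(records, p, n_fixed, lam)
        step = rz / (p @ Cp)
        x += step * p
        r -= step * Cp
        if np.linalg.norm(r) <= tol * np.linalg.norm(rhs):
            break
        z = r / diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, it + 1

rng = np.random.default_rng(3)
n_fixed, n_animals, n_records, lam = 3, 40, 200, 2.0
records = [(k % n_fixed, int(rng.integers(n_animals))) for k in range(n_records)]
y = rng.normal(size=n_records)
solution, n_iterations = pcg(records, y, n_fixed, n_animals, lam)
print(n_iterations)
```

The gather-scale-scatter form touches each record once per product and never builds the pairwise coefficients between effects, which is where the per-iteration saving would be expected to come from in larger multi-trait models.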
The objective of this study was to evaluate the feasibility of the test-day (TD) single-step genomic BLUP (ssGBLUP) using phenotypic records of Nordic Red Dairy cows. The critical point in ssGBLUP is how genomically derived relationships (G) are integrated with population-based pedigree relationships (A) into a combined relationship matrix (H). Therefore, we also tested how different weights for genomic and pedigree relationships affect ssGBLUP, validation reliability, and validation regression coefficients. Deregressed proofs for 305-d milk, protein, and fat yields were used for a posteriori validation. The results showed that the use of phenotypic TD records in ssGBLUP is feasible. Moreover, the TD ssGBLUP model gave considerably higher validation reliabilities and validation regression coefficients than the TD model without genomic information. No significant differences were found in validation reliability between the different TD ssGBLUP models according to bootstrap confidence intervals. However, the degree of inflation in genomic enhanced breeding values is affected by the method used in construction of the H matrix. The results showed that ssGBLUP provides a good alternative to the currently used multi-step approach but there is a great need to find the best option to combine pedigree and genomic information in the genomic matrix.
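How a weight between genomic and pedigree relationships enters the combined matrix can be sketched as follows. This uses a commonly cited form of the inverse of H, in which the genotyped block of the inverse pedigree matrix is adjusted by the difference between the inverses of a blended G and the pedigree block; the abstract does not state which weighting variants were tested, so the formula, the weight `w`, the toy pedigree, and the G values are all assumptions for illustration.

```python
import numpy as np

def relationship_matrix(pedigree):
    """Tabular method; pedigree lists (sire, dam) indices, parents before offspring."""
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (s, d) in enumerate(pedigree):
        both_known = s is not None and d is not None
        A[i, i] = 1.0 + (0.5 * A[s, d] if both_known else 0.0)
        for j in range(i):
            a = 0.0
            if s is not None:
                a += 0.5 * A[j, s]
            if d is not None:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A

def h_inverse(A, G, genotyped, w):
    """Inverse of H with G blended as (1 - w) * G + w * A22 (assumed form)."""
    block = np.ix_(genotyped, genotyped)
    A22 = A[block]
    G_w = (1.0 - w) * G + w * A22
    H_inv = np.linalg.inv(A)
    # Only the genotyped block differs from the pedigree-based inverse.
    H_inv[block] += np.linalg.inv(G_w) - np.linalg.inv(A22)
    return H_inv

# Toy pedigree: two founders, two full sibs, and one offspring of the sibs.
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
genotyped = [2, 3, 4]
A = relationship_matrix(pedigree)

# Hypothetical genomic relationships for the three genotyped animals.
G = np.array([[1.02, 0.45, 0.78],
              [0.45, 0.98, 0.70],
              [0.78, 0.70, 1.22]])

H_inv = h_inverse(A, G, genotyped, w=0.1)
print(np.round(H_inv, 3))
```

Raising `w` pulls the genotyped block toward the pedigree relationships, and when G equals the pedigree block the result reduces to the inverse of A for any weight.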