We compared the accuracies of four genomic-selection prediction methods as affected by marker density, level of linkage disequilibrium (LD), quantitative trait locus (QTL) number, sample size, and level of replication in populations generated from multiple inbred lines. Marker data on 42 two-row spring barley inbred lines were used to simulate high and low LD populations from multiple inbred line crosses: the first included many small full-sib families and the second was derived from five generations of random mating. True breeding values (TBV) were simulated on the basis of 20 or 80 additive QTL. Methods used to derive genomic estimated breeding values (GEBV) were random regression best linear unbiased prediction (RR-BLUP), Bayes-B, a Bayesian shrinkage regression method, and BLUP from a mixed model analysis using a relationship matrix calculated from marker data. Using the best methods, accuracies of GEBV were comparable to accuracies from phenotype for predicting TBV without requiring the time and expense of field evaluation. We identified a trade-off between a method's ability to capture marker-QTL LD vs. marker-based relatedness of individuals. The Bayesian shrinkage regression method primarily captured LD, the BLUP methods captured relationships, while Bayes-B captured both. Under most of the study scenarios, mixed-model analysis using a marker-derived relationship matrix (BLUP) was more accurate than methods that directly estimated marker effects, suggesting that relationship information was more valuable than LD information. When markers were in strong LD with large-effect QTL, or when predictions were made on individuals several generations removed from the training data set, however, the ranking of method performance was reversed and BLUP had the lowest accuracy.

[...] proper posterior distributions of effect estimates and to better estimate QTL locations.
We can regard these Bayesian shrinkage regression methods as another form of genomic selection when using them to predict breeding values. In previous work, these genomic-selection prediction methods have predominantly been compared under a specific simulation scenario such that their relative strengths under different conditions of linkage disequilibrium (LD), marker density, training data set size, and distribution of QTL effects are not [...]
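As a rough illustration of the two BLUP-type methods named in the abstract above, the following Python sketch computes GEBV once by estimating marker effects (RR-BLUP) and once from a marker-derived relationship matrix. It is not the authors' implementation. The 0/1 marker coding, the simulated data, the shrinkage parameter `lam` (treated as known instead of being estimated from variance components), the unscaled relationship matrix, and the function names `rr_blup` and `marker_blup` are all assumptions made for illustration.

```python
import numpy as np

def rr_blup(Z, y, lam):
    """Estimate marker effects by ridge regression; return (effects, GEBV).

    lam is assumed known (residual variance / per-marker effect variance).
    """
    Zc = Z - Z.mean(axis=0)          # center marker scores
    yc = y - y.mean()                # remove the overall mean
    p = Zc.shape[1]
    effects = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ yc)
    return effects, Zc @ effects

def marker_blup(Z, y, lam):
    """Predict breeding values from a marker-derived relationship matrix."""
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    K = Zc @ Zc.T                    # unscaled relationship matrix (assumption)
    n = K.shape[0]
    return K @ np.linalg.solve(K + lam * np.eye(n), yc)

# Hypothetical simulated data: 60 inbred lines, 200 markers, 20 additive QTL.
rng = np.random.default_rng(0)
Z = rng.integers(0, 2, size=(60, 200)).astype(float)
qtl = rng.choice(200, size=20, replace=False)
tbv = Z[:, qtl] @ rng.normal(size=20)             # true breeding values
y = tbv + rng.normal(scale=tbv.std(), size=60)    # heritability roughly 0.5

effects, gebv_rr = rr_blup(Z, y, lam=50.0)
gebv_k = marker_blup(Z, y, lam=50.0)
accuracy = np.corrcoef(gebv_rr, tbv)[0, 1]        # correlation of GEBV with TBV
```

With the same `lam` and this particular relationship matrix, the two functions return the same GEBV up to rounding, which is one way to see why both BLUP approaches draw on relationship information. A relationship matrix computed or scaled differently, or variance components estimated separately for each model, would break this exact equality.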
To develop inbred lines, parents are crossed to generate segregating populations from which superior inbred progeny are selected. The value of a particular cross thus depends on the expected performance of its best progeny, which we call the superior progeny value. Superior progeny value is a linear combination of the mean of the cross's progeny and their standard deviation. In this study we specify theory to predict a cross's progeny standard deviation from QTL results and explore analytically and by simulation the variance of that standard deviation under different genetic models. We then study the impact of different QTL analysis methods on the prediction accuracy of a cross's superior progeny value. We show that including all markers, rather than only markers with significant effects, improves the prediction. Methods that account for the uncertainty of the QTL analysis by integrating over the posterior distributions of effect estimates also produce better predictions than methods that retain only point estimates from the QTL analysis. The utility of including estimates of a cross's among-progeny standard deviation in the prediction increases with increasing heritability and marker density but decreasing genome size and QTL number. This utility is also higher if crosses are envisioned only among the best parents rather than among all parents. Nevertheless, we show that among crosses the variance of progeny means is generally much greater than the variance of progeny standard deviations, restricting the utility of estimates of progeny standard deviations to a relatively small parameter space.

In inbred line development, parents are crossed to generate segregating populations from which superior inbred progeny are selected. The value of a particular cross depends on the performance of its best progeny rather than on its mean progeny performance.
In a typical breeding program, far too many crosses are possible between elite candidate parents for exhaustive evaluation. For example, among 50 elite parents there are 1225 possible crosses. Even if it were feasible to evaluate a sufficient set of progeny from all those crosses, it is unlikely that this would be efficient. Rather, one would want to predict, among possible crosses, which ones are most likely to lead to superior inbred lines. Schnell and Utz (1975) introduced the usefulness concept for line development. Their definition of the usefulness of the cross $m$ was $U_m = \mu_m + \Delta G_m = \mu_m + i\,\sigma_{G(m)}\,h_m$, where $\mu_m$ is the population mean of homozygous lines that can be derived from cross $m$, $\sigma^2_{G(m)}$ is the genetic variance among these lines, $h_m$ is the square root of the heritability, and $i$ is the standardized selection intensity. Two other criteria similar to usefulness are the varietal ability (Wright 1974; Gallais 1979) and the probability of obtaining transgressive segregants (Jinks and Pooni 1976). Here, rather than focus on the genetic gain that might be obtained within a cross, we sought a simpler characterization that expresses which crosses would...
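The cross count and the usefulness criterion described above can be sketched in a few lines of Python. The usefulness of a cross is taken here as the progeny mean plus the selection intensity times the genetic standard deviation times the square root of heritability. The cross names, the means and standard deviations, and the values of `h` and `i` are hypothetical and chosen only to show how a cross with a lower mean can still rank first.

```python
from math import comb

def usefulness(mu, sigma_g, h, i):
    """Usefulness of a cross: progeny mean plus expected gain from selection."""
    return mu + i * sigma_g * h

# Number of distinct crosses among 50 parents (order of parents ignored).
n_crosses = comb(50, 2)

# Hypothetical crosses: (progeny mean, genetic standard deviation).
crosses = {"P1/P2": (5.0, 0.4), "P1/P3": (4.8, 0.9)}
h = 0.8        # assumed square root of heritability
i = 1.755      # standardized selection intensity for selecting the top 10%

# Rank crosses from most to least useful.
ranked = sorted(crosses, key=lambda c: usefulness(*crosses[c], h, i), reverse=True)
```

In this made-up example `P1/P3` ranks ahead of `P1/P2` despite its lower mean, because its larger genetic standard deviation raises the expected performance of its best progeny.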
Genomewide selection within an A/B biparental cross is most advantageous if it could be effectively done before the cross is phenotyped. Our objectives were to determine if a general combining ability (GCA) model is useful for genomewide selection in an A/B cross, and to assess the influence of training population size ($N_{GCA}$), number of crosses pooled into the training population ($N_{\times}$), linkage disequilibrium ($r^2$), and heritability ($h^2$) on the prediction accuracy with the GCA model. The GCA model involved pooling 4 to 38 maize crosses with A and B as one of the parents into the training population for an A/B cross, whereas the same background (SB) model involved pooling crosses between random inbreds. Across 30 A/B test populations, the mean response to selection (R) with the GCA model was 0.19 Mg ha⁻¹ for testcross grain yield, −6 g kg⁻¹ for moisture, and 0.38 kg hL⁻¹ for test weight. These R values with the GCA model were 68 to 76% of the corresponding R values with phenotypic selection (PS). The R values with the SB model were only 15 to 28% of the R values with PS. Increasing the size of the training population with random crosses from the same heterotic group was less important than including crosses with A and B as one of the parents. Prediction accuracy was most highly correlated with $h^2 r^2 N_{GCA}$ and $h^2 r^2 N_{\times}$. Our results indicated that the GCA model is routinely effective for genomewide selection within A/B crosses, before phenotyping the progeny in the cross.
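The pooling step of the GCA model described above can be sketched as a simple filter over previously evaluated crosses. This is a minimal sketch under stated assumptions: crosses are represented as pairs of parent names, the function name `gca_training_crosses` is hypothetical, and "with A and B as one of the parents" is read as sharing exactly one parent with the target cross, so the A/B cross itself is excluded.

```python
def gca_training_crosses(crosses, a, b):
    """Return crosses that share exactly one parent with the target cross a/b."""
    target = {a, b}
    return [c for c in crosses if len(target & set(c)) == 1]

# Hypothetical set of previously phenotyped crosses.
available = [("A", "C"), ("D", "A"), ("B", "E"), ("C", "D"), ("A", "B")]

# Crosses pooled into the training population for the A/B cross.
training = gca_training_crosses(available, "A", "B")
```

The cross between C and D is dropped because it involves neither parent, which corresponds to the kind of unrelated cross the same background model would pool.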
Abbreviations: DH, doubled haploid; $E(r_{MG})$, expected prediction accuracy; $h^2$, heritability; $M_e$, effective number of chromosome segments; $N$, population size; $N_M$, number of markers; QTL, quantitative trait locus; $r^2$, linkage disequilibrium; $r_{MG}$, correlation between predicted and true genotypic values; RR-BLUP, ridge regression-best linear unbiased prediction; SNP, single nucleotide polymorphism; $V_G$, genetic variance; $V_R$, nongenetic variance.
Good methods are lacking for predicting the genetic variance (VG) in biparental populations. Our objective was to determine whether genomewide marker effects and related populations could be used to predict the VG when two parents (A and B) are crossed to form a segregating population. For each of 85 A/B populations, 2 to 23 maize (Zea mays L.) populations with A and B as one of the parents were used as the training population. In the genomewide selection model, the testcross VG in A/B was predicted as the variance among the predicted genotypic values of progeny from a simulated A/B population. In the mean variance model, VG in A/B was predicted as the mean of VG in a series of A/* populations and */B populations, where * denotes a random parent. The correlations between observed and predicted VG were significant (P = 0.05) for both the genomewide selection model (0.18 for yield, 0.49 for moisture, and 0.52 for test weight) and the mean variance model (0.26 for yield, 0.46 for moisture, and 0.50 for test weight). The percentages of bias in estimates of VG were −28 to −60% for the genomewide selection model, but were only −1 to 5% for the mean variance model. Our results indicated that the VG in an A/B population could be predicted as the mean variance among populations with A and B as one of the parents. The mean variance model should be practical in breeding programs because it simply uses phenotypic data from prior, related populations.
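The mean variance model described above amounts to averaging the genetic variances observed in prior populations that share a parent with the target cross. The sketch below assumes a pooled average over all A/* and */B populations (the abstract does not say whether the two groups are averaged separately first) and assumes percentage bias is defined relative to the observed variance. The function names and all numbers are hypothetical.

```python
from statistics import mean

def mean_variance_prediction(vg_by_cross, a, b):
    """Predict V_G in a/b as the mean V_G of prior crosses sharing one parent."""
    target = {a, b}
    related = [vg for parents, vg in vg_by_cross.items()
               if len(target & set(parents)) == 1]
    return mean(related)

def percent_bias(predicted, observed):
    """Bias of a prediction as a percentage of the observed value (assumed definition)."""
    return 100.0 * (predicted - observed) / observed

# Hypothetical genetic variances estimated in prior populations.
prior_vg = {("A", "C"): 2.0, ("A", "D"): 4.0, ("E", "B"): 3.0, ("C", "D"): 10.0}

predicted = mean_variance_prediction(prior_vg, "A", "B")
bias = percent_bias(predicted, observed=4.0)
```

Only phenotypic variance estimates from earlier populations enter the calculation, which is why the abstract describes this model as practical for breeding programs.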