Genomic selection (GS) models use genome-wide genetic information to predict the genetic values of selection candidates. Originally, these models were developed without considering genotype × environment interaction (G×E). Several authors have proposed extensions of the single-environment GS model that accommodate G×E using either covariance functions or environmental covariates. In this study, we model G×E using a marker × environment interaction (M×E) GS model; the approach is conceptually simple and can be implemented with existing GS software. We discuss how the model can be implemented either through an explicit regression of phenotypes on markers or through covariance structures (a genomic best linear unbiased prediction-type model). We used the M×E model to analyze three CIMMYT wheat data sets (W1, W2, and W3), where more than 1000 lines were genotyped using genotyping-by-sequencing and evaluated at CIMMYT’s research station in Ciudad Obregon, Mexico, under simulated environmental conditions that covered different irrigation levels, sowing dates and planting systems. We compared the M×E model with a stratified (i.e., within-environment) analysis and with a standard (across-environment) GS model that assumes that effects are constant across environments (i.e., ignoring G×E). The prediction accuracy of the M×E model was substantially greater than that of an across-environment analysis that ignores G×E. Depending on the prediction problem, the M×E model had either similar or greater levels of prediction accuracy than the stratified analyses. The M×E model decomposes marker effects and genomic values into components that are stable across environments (main effects) and others that are environment-specific (interactions). Therefore, in principle, the interaction model could shed light on which variants have effects that are stable across environments and which ones are responsible for G×E.
The data set and the scripts required to reproduce the analysis are publicly available as Supporting Information.
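The two implementations of the M×E model mentioned above (explicit regression on markers, or covariance structures) can be illustrated with a minimal sketch. It assumes the decomposition described in the abstract: one set of marker effects shared by all environments plus one environment-specific set per environment. The function names (`mxe_design`, `mxe_kernels`), the scaling of the kernels by the number of markers, and the toy marker matrices are illustrative assumptions, not taken from the authors' scripts.

```python
import numpy as np

def mxe_design(X_by_env):
    """Build main-effect and interaction design matrices (assumed layout).

    X0 stacks the per-environment marker matrices, so one effect vector is
    shared by all environments. X1 is block-diagonal, so each environment
    gets its own deviation from the shared effects.
    """
    X0 = np.vstack(X_by_env)
    n_total, p = X0.shape
    X1 = np.zeros((n_total, p * len(X_by_env)))
    row = 0
    for j, Xj in enumerate(X_by_env):
        n_j = Xj.shape[0]
        X1[row:row + n_j, j * p:(j + 1) * p] = Xj
        row += n_j
    return X0, X1

def mxe_kernels(X_by_env):
    """Covariance-structure version: one main kernel plus one kernel per environment."""
    X0, _ = mxe_design(X_by_env)
    p = X0.shape[1]
    G0 = X0 @ X0.T / p  # borrows information across environments
    env_kernels = []
    row = 0
    for Xj in X_by_env:
        n_j = Xj.shape[0]
        Gj = np.zeros_like(G0)
        # Non-zero only within environment j, so no sharing across environments.
        Gj[row:row + n_j, row:row + n_j] = Xj @ Xj.T / p
        env_kernels.append(Gj)
        row += n_j
    return G0, env_kernels

# Toy example: 2 environments, 4 lines each, 6 markers (simulated, not real data).
rng = np.random.default_rng(0)
X_env = [rng.standard_normal((4, 6)) for _ in range(2)]
X0, X1 = mxe_design(X_env)
G0, G_env = mxe_kernels(X_env)
```

In this layout, the main-effect kernel has non-zero covariances between records from different environments, whereas the interaction kernels are zero outside their own environment block. Each kernel would then receive its own variance component in a mixed or Bayesian model.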
One of the most important applications of genomic selection in maize breeding is to predict and identify the best untested lines from biparental populations, when the training and validation sets are derived from the same cross. Nineteen tropical maize biparental populations evaluated in multienvironment trials were used in this study to assess the prediction accuracy of different quantitative traits using low-density (~200 markers) and genotyping-by-sequencing (GBS) single-nucleotide polymorphisms (SNPs). An extension of the Genomic Best Linear Unbiased Predictor that incorporates genotype × environment (GE) interaction was used to predict genotypic values; cross-validation methods were applied to quantify prediction accuracy. Our results showed that: (1) low-density SNPs (~200 markers) were largely sufficient to get good prediction in biparental maize populations for simple traits with moderate-to-high heritability, but GBS outperformed low-density SNPs for complex traits and for simple traits evaluated under stress conditions with low-to-moderate heritability; (2) heritability and genetic architecture of the target traits affected prediction performance: the prediction accuracy of complex traits (grain yield) was consistently lower than that of simple traits (anthesis date and plant height), and prediction accuracy under stress conditions was consistently lower and more variable than under well-watered conditions for all the target traits because of their poor heritability under stress conditions; and (3) GE models were more accurate than non-GE models for complex traits, whereas their advantage was marginal for simple traits.
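The cross-validation step used to quantify prediction accuracy can be sketched as follows. This is only an illustration under stated assumptions: it uses ridge regression on markers as a stand-in for a G-BLUP-type predictor, omits the GE extension, measures accuracy as the Pearson correlation between predicted and observed values, and runs on simulated genotypes. The names `ridge_fit` and `cv_accuracy` and the choice of five folds are hypothetical, not taken from the study.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge-regression marker effects: (X'X + lam * I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_accuracy(X, y, lam=1.0, n_folds=5, seed=0):
    """Mean correlation between predicted and observed values over the folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Center with training-set statistics only, to avoid leaking test data.
        mu = y[train_idx].mean()
        center = X[train_idx].mean(axis=0)
        b = ridge_fit(X[train_idx] - center, y[train_idx] - mu, lam)
        y_hat = mu + (X[test_idx] - center) @ b
        accuracies.append(np.corrcoef(y_hat, y[test_idx])[0, 1])
    return float(np.mean(accuracies))

# Simulated population: 100 lines, 20 markers coded 0/1/2 (not real data).
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(100, 20)).astype(float)
y = X @ rng.standard_normal(20) + 0.5 * rng.standard_normal(100)
accuracy = cv_accuracy(X, y)
```

Lines in the held-out fold play the role of untested lines from the same cross. Repeating the procedure with different marker sets (for example low-density versus GBS) would give the kind of comparison reported above.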
Potato (Solanum tuberosum) is a staple food crop and is considered one of the main sources of carbohydrates worldwide. Late blight (Phytophthora infestans) and common scab (Streptomyces scabies) are two of the primary production constraints faced by potato farming. Previous studies have identified a few resistance genes for both late blight and common scab; however, these genes explain only a limited fraction of the heritability of these diseases. Genomic selection has been demonstrated to be an effective methodology for breeding value prediction in many major crops (e.g., maize and wheat). However, the technology has received little attention in potato breeding. We present the first genomic selection study involving late blight and common scab in tetraploid potato. Our data involve 4,110 single nucleotide polymorphisms (SNPs) and phenotypic field evaluations for late blight (n=1,763) and common scab (n=3,885) collected in seven and nine years, respectively. We report moderately high genomic heritability estimates (0.46 ± 0.04 and 0.45 ± 0.017, for late blight and common scab, respectively). The extent of genotype-by-year interaction was high for late blight and low for common scab. Our assessment of prediction accuracy demonstrates the applicability of genomic prediction for tetraploid potato breeding. For both traits, we found that more than 90% of the genetic variance could be captured with an additive model. For common scab, the highest prediction accuracy was achieved using an additive model. For late blight, small but statistically significant gains in prediction accuracy were achieved using a model that accounted for both additive and dominance effects. Using whole-genome regression models, we identified SNPs located in previously reported hotspot regions for late blight, on genes associated with systemic disease resistance responses, and a new locus located in a WRKY transcription factor for common scab.
Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and in linkage disequilibrium patterns can lead to heterogeneity in SNP effects. In this context, calibrating genomic predictions using a large, potentially heterogeneous, training data set may not lead to optimal prediction accuracy. Some studies have tried to address this sample size/homogeneity trade-off using training set optimization algorithms; however, this approach assumes that a single training data set is optimal for all individuals in the prediction set. Here, we propose an approach that identifies, for each individual in the prediction set, a subset of the training data (i.e., a set of support points) from which predictions are derived. The methodology that we propose is a Sparse Selection Index (SSI) that integrates Selection Index methodology with sparsity-inducing techniques commonly used for high-dimensional regression. The sparsity of the resulting index is controlled by a regularization parameter (λ); the G-BLUP (the prediction method most commonly used in plant and animal breeding) appears as the special case λ = 0. In this study, we present the methodology and demonstrate (using two wheat data sets with phenotypes collected in ten different environments) that the SSI can achieve significant gains (between 5 and 10%) in prediction accuracy relative to the G-BLUP.
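The role of λ in the index can be illustrated with a small sketch. It assumes that the weights for one prediction individual minimize a penalized quadratic objective, ½ β'Aβ − c'β + λ‖β‖₁, where A is the training genomic relationship matrix plus a ridge term and c holds the relationships between that individual and the training lines. This formulation, the coordinate-descent solver, the name `ssi_weights`, the ridge value `theta`, and the simulated genotypes are all assumptions made for illustration; the abstract itself does not specify them.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def ssi_weights(A, c, lam, n_iter=500, tol=1e-10):
    """Coordinate descent for 0.5 * b'Ab - c'b + lam * ||b||_1 (assumed objective)."""
    beta = np.zeros_like(c)
    for _ in range(n_iter):
        max_change = 0.0
        for k in range(len(c)):
            # Residual relationship for coordinate k, excluding its own contribution.
            partial = c[k] - A[k] @ beta + A[k, k] * beta[k]
            new = soft_threshold(partial, lam) / A[k, k]
            max_change = max(max_change, abs(new - beta[k]))
            beta[k] = new
        if max_change < tol:
            break
    return beta

# Simulated markers: 30 training lines plus 1 prediction individual, 50 markers.
rng = np.random.default_rng(0)
X = rng.standard_normal((31, 50))
G = X @ X.T / X.shape[1]        # genomic relationship matrix
theta = 1.0                     # assumed ratio of error to genetic variance
A = G[:30, :30] + theta * np.eye(30)
c = G[:30, 30]

beta_gblup = ssi_weights(A, c, lam=0.0)                       # dense weights
beta_sparse = ssi_weights(A, c, lam=0.5 * np.abs(c).max())    # fewer support points
beta_empty = ssi_weights(A, c, lam=np.abs(c).max())           # all weights zero
```

With λ = 0 the solution coincides with the linear-system solution `np.linalg.solve(A, c)`, which corresponds to G-BLUP-type weights under these assumptions. Increasing λ sets more weights exactly to zero, so the training lines with non-zero weights form the support points for that individual, and the prediction is the weighted sum of their (centered) phenotypes.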