Genomic selection offers great potential for increasing the rate of genetic improvement in plant breeding programs. This research used simulation to evaluate the effectiveness of different strategies for genotyping and phenotyping to enable genomic selection in early generation individuals (e.g., F2) in breeding programs involving biparental or similar (e.g., backcross or top cross) populations. By using phenotypes that were previously collected in other biparental populations, selection decisions could be made without waiting for phenotypes that pertain directly to the selection candidate to be collected, a process that would take at least three growing seasons. If these phenotypes were collected in biparental populations that were closely related to the selection candidates, only a small number of markers (e.g., 200–500) and a small number of phenotypes (e.g., 1000) were needed to achieve effective accuracy of estimated breeding values. If these phenotypes were collected in biparental populations that were not closely related to the selection candidates, as many as 10,000 markers and 5000 to 20,000 phenotypes were needed. Increasing marker density beyond 10,000 markers did not show benefit and in some scenarios reduced the accuracy of prediction. This study provides a guide, enabling resource allocation to be optimized between genotyping and phenotyping investment dependent on the population under development.
The density and utility of the molecular genetic linkage map of cultivated sunflower (Helianthus annuus L.) has been greatly increased by the development and mapping of several hundred simple sequence repeat (SSR) markers. Of 1089 public SSR markers described thus far, 408 have been mapped in a recombinant inbred line (RIL) mapping population (RHA280 × RHA801). The goal of the present research was to increase the density of the sunflower map by constructing a new RIL map (PHA × PHB) based on SSRs, adding loci for newly developed SSR markers to the RHA280 × RHA801 RIL map, and integrating the restriction fragment length polymorphism (RFLP) and SSR maps of sunflower. The latter was accomplished by adding 120 SSR marker loci to a backbone of 80 RFLP marker loci on the HA370 × HA372 F2 map. The map spanned 1275.4 centimorgans (cM) and had a mean density of 6.3 cM per locus. The PHA × PHB SSR map was constructed from 264 SSR marker loci, spanned 1199.4 cM, and had a mean density of 4.5 cM per locus. The RHA280 × RHA801 map was constructed by adding 118 new SSR and insertion-deletion (INDEL) marker loci to 459 previously mapped SSR marker loci. The 577-locus map spanned 1423.0 cM and had a mean density of 2.5 cM per locus. The three maps were constructed from 1044 DNA marker loci (701 unique SSR and 89 unique RFLP or INDEL marker loci) and supply a dense genome-wide framework of sequence-based DNA markers for molecular breeding and genomics research in sunflower.
The widespread use of RFLP markers and maps in sunflower has been restricted by a lack of public RFLP probes, consequent lack of a dense public RFLP map, and low-throughput nature of RFLP markers. The difficulties posed by the historic lack of public, single-copy DNA markers were only weakly offset by the emergence of facile, universal DNA markers, e.g., RAPDs (Williams et al., 1990, 1993) and AFLPs (Vos et al., 1995). RAPDs have primarily been used for tagging phenotypic loci in sunflower, e.g., rust (Puccinia helianthi Schw.) and Orobanche cumana Wallr. resistance genes (Lawson et al., 1998; Lu et al., 2000). While RAPD and AFLP markers have a multitude of uses, both are dominant, multicopy, and often nonspecific in nature and, as a whole, unsatisfactory for establishing a genome-wide framework of DNA markers for anchoring and cross referencing genetic linkage maps. Single-copy, codominant DNA markers, e.g., SSRs, are preferred for such purposes and, until recently, have been lacking in sunflower.
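The mean densities quoted for the three sunflower maps can be checked with simple arithmetic. The sketch below assumes that mean density is the map length divided by the number of loci; the source does not state the formula, and the names `MAPS` and `mean_density` are illustrative only. Under that assumption the ratio agrees with each reported value to within about 0.1 cM per locus.

```python
# Map lengths (cM), locus counts, and reported densities as given in the abstract.
# Assumption: mean density = map length / number of loci (not stated in the source).
MAPS = {
    "HA370 × HA372 F2": {"length_cm": 1275.4, "loci": 120 + 80, "reported": 6.3},
    "PHA × PHB": {"length_cm": 1199.4, "loci": 264, "reported": 4.5},
    "RHA280 × RHA801": {"length_cm": 1423.0, "loci": 118 + 459, "reported": 2.5},
}


def mean_density(length_cm: float, loci: int) -> float:
    """Return the mean spacing in cM per locus under the simple-ratio assumption."""
    return length_cm / loci


# Computed density for each map, keyed by map name.
computed = {name: mean_density(m["length_cm"], m["loci"]) for name, m in MAPS.items()}

for name, value in computed.items():
    reported = MAPS[name]["reported"]
    print(f"{name}: computed {value:.2f} cM/locus, reported {reported} cM/locus")
```

Small differences between the computed and reported values may reflect rounding or a slightly different definition of density in the original study.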
""Sparse testing" refers to reduced multi-environment breeding trials in which not all genotypes of interest are grown in each environment. Using genomic-enabled prediction and a model embracing genotype × environment interaction (GE), the non-observed genotype-in-environment combinations can be predicted. Consequently, the overall costs can be reduced and the testing capacities can be increased. The accuracy of predicting the unobserved data depends on different factors including (1) how many genotypes overlap between environments, (2) in how many environments each genotype is grown, and (3) which prediction method is used. In this research, we studied the predictive ability obtained when using a fixed number of plots and different sparse testing designs. The considered designs included the extreme cases of (1) no overlap of genotypes between environments, and (2) complete overlap of the genotypes between environments. In the latter case, the prediction set fully consists of genotypes that have not been tested at all. Moreover, we gradually go from one extreme to the other considering (3) intermediates between the two previous cases with varying numbers of different or non-overlapping (NO)/overlapping (O) genotypes. The empirical study is built upon two different maize hybrid data sets consisting of different genotypes crossed to two different testers (T1 and T2) and each data set was analyzed separately. For each set, phenotypic records on yield from three different environments are available. Three different prediction models were implemented, two main effects models (M1 and M2), and a model (M3) including the genotype-by-environment interaction term (GE). The results showed that the genome-based model including GE (M3) captured more phenotypic variation than the models that did not include this component. Also, M3 provided higher prediction accuracy than models M1 and M2 for the different allocation scenarios. 
Reducing the size of the calibration sets decreased the prediction accuracy under all allocation designs with M3being the less affected model; however, using the genome-enabled models (i.e., M2 and M3) the predictive ability is recovered when more genotypes are tested across environments. Our results indicate that a substantial part of the testing resources can be saved when using genome-based models including GE for optimizing sparse testing designs.
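The allocation designs described in the sparse testing abstract (no overlap, complete overlap, and intermediates with a fixed number of plots) can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `sparse_allocation`, its parameters, and the genotype labels are hypothetical, every environment is assumed to have the same number of plots, and genotypes are assigned in list order rather than at random.

```python
from typing import Dict, List, Tuple


def sparse_allocation(
    genotypes: List[str], n_envs: int, plots_per_env: int, n_overlap: int
) -> Tuple[Dict[int, List[str]], List[str]]:
    """Allocate genotypes to environments with a fixed number of plots.

    n_overlap genotypes are grown in every environment; the remaining plots in
    each environment hold genotypes that appear in that environment only.
    Returns (design, untested), where design maps environment index to the
    genotypes grown there and untested lists genotypes grown nowhere.
    """
    if not 0 <= n_overlap <= plots_per_env:
        raise ValueError("n_overlap must lie between 0 and plots_per_env")
    n_unique = plots_per_env - n_overlap  # non-overlapping genotypes per environment
    needed = n_overlap + n_envs * n_unique
    if needed > len(genotypes):
        raise ValueError("not enough genotypes for this design")

    overlap = genotypes[:n_overlap]
    design: Dict[int, List[str]] = {}
    start = n_overlap
    for env in range(n_envs):
        design[env] = overlap + genotypes[start:start + n_unique]
        start += n_unique
    untested = genotypes[start:]
    return design, untested


# Hypothetical example: 12 genotypes, 3 environments, 4 plots each (12 plots total).
lines = [f"G{i}" for i in range(1, 13)]
no_overlap, untested_no = sparse_allocation(lines, 3, 4, n_overlap=0)
partial, untested_partial = sparse_allocation(lines, 3, 4, n_overlap=2)
complete, untested_complete = sparse_allocation(lines, 3, 4, n_overlap=4)
```

With the total number of plots held constant, raising `n_overlap` trades the number of distinct genotypes tested for replication across environments; in the complete-overlap case, the genotypes left to predict are exactly those never grown in any environment. In a real trial the genotype order would normally be randomized before allocation.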