Key message: Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance.

Abstract: The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five TRS sampling algorithms (stratified sampling, mean of the coefficient of determination (CDmean), mean of prediction error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling) were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, it is desirable for the sampling method to capture as much phenotypic variation as possible in the TRS. The wheat dataset showed mild population structure, and the CDmean and stratified CDmean methods showed the highest accuracies for all traits except test weight and heading date. The rice dataset had strong population structure, and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS while maximizing the relationship between the TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion for optimizing the TRS seems to depend on the interaction of trait architecture and population structure.

Electronic supplementary material: The online version of this article (doi:10.1007/s00122-014-2418-4) contains supplementary material, which is available to authorized users.
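To make the stratified sampling idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each genotype already carries a subpopulation label (for example from a clustering step, which the abstract does not describe) and that the TRS is filled in proportion to subpopulation size. The function name `stratified_training_set` and the argument `subpop_of` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_training_set(subpop_of, n_train, seed=0):
    """Pick n_train genotype IDs, drawing from each subpopulation in
    proportion to its size (assumed allocation rule, largest remainder)."""
    total = len(subpop_of)
    if not 0 < n_train <= total:
        raise ValueError("n_train must be between 1 and the number of genotypes")

    # Group genotype IDs by their (assumed, precomputed) subpopulation label.
    strata = defaultdict(list)
    for genotype, subpop in subpop_of.items():
        strata[subpop].append(genotype)

    # Integer quota per stratum plus the remainder left after flooring.
    alloc, remainder = {}, {}
    for subpop, members in strata.items():
        alloc[subpop], remainder[subpop] = divmod(n_train * len(members), total)

    # Hand the unallocated slots to the strata with the largest remainders;
    # ties are broken by label so the result is deterministic.
    leftover = n_train - sum(alloc.values())
    ranked = sorted(strata, key=lambda s: (-remainder[s], s))
    for subpop in ranked[:leftover]:
        alloc[subpop] += 1

    # Sample without replacement inside each stratum.
    rng = random.Random(seed)
    chosen = []
    for subpop in sorted(strata):
        chosen.extend(rng.sample(sorted(strata[subpop]), alloc[subpop]))
    return chosen
```

Random sampling, by contrast, would draw `n_train` genotypes from the pooled list and could leave a small subpopulation unrepresented in the TRS. Criteria such as CDmean and PEVmean require the genomic relationship matrix and are not sketched here.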
Twelve field experiments comparing 24 durum wheat varieties from three periods, old (<1945), intermediate and modern (1988-2000), were carried out in order to ascertain the advances made in durum wheat yield components and related traits in Italian and Spanish germplasm. Grain yield improvements were based on linear increases in the number of grains per m² and harvest index, while grain weight and biomass remained unchanged. Yield per plant increased at a rate of 0.36 and 0.44% y⁻¹ and the number of grains per m² improved by 39% and 55% in Italian and Spanish varieties, respectively. The mean rate of increase in the number of grains per m² was 0.55% y⁻¹. Plants per m², spikes per plant and grains per spike contributed 20%, 29% and 51%, respectively, to the increase in the number of grains per m². The increase in the number of grains per m² was due to the greater grain set in the modern varieties, since the number of spikelets per spike remained unchanged. Harvest index increased overall by 0.48% y⁻¹ (0.40 and 0.53% y⁻¹ in Italian and Spanish varieties, respectively). Plant height was the trait that underwent the most dramatic change (it decreased by 0.81% y⁻¹, with little difference between the varieties of the two countries), as a consequence of the presence of the Rht-B1 dwarfing gene. Harvest index and plant height, which were the traits that contributed most to discriminating between periods, remained unchanged from 1980 to 2000. The higher rates of improvement in Spain are discussed in the context of the contrasting strategies followed to improve durum wheat yield in the two countries.
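Rates expressed in % y⁻¹, such as those above, are commonly obtained by regressing a trait on the year of release of each variety and scaling the slope by the trait mean. The abstract does not state how its rates were derived, so the sketch below is only one plausible reading. The function name `relative_gain_percent_per_year` is hypothetical, and the numbers in the usage example are invented purely to exercise the code.

```python
def relative_gain_percent_per_year(years, values):
    """Least-squares slope of trait value on year of release,
    expressed as a percentage of the mean trait value (assumed convention)."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired observations")

    mean_year = sum(years) / n
    mean_value = sum(values) / n

    # Ordinary least-squares slope: Sxy / Sxx.
    sxx = sum((x - mean_year) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years of release must not all be identical")
    sxy = sum((x - mean_year) * (y - mean_value) for x, y in zip(years, values))
    slope = sxy / sxx

    return 100.0 * slope / mean_value


# Illustrative, made-up data: trait value per variety against year of release.
example_years = [1940, 1960, 1980, 2000]
example_values = [100.0, 110.0, 120.0, 130.0]
example_rate = relative_gain_percent_per_year(example_years, example_values)
```

Other conventions exist, for instance scaling the slope by the value of the oldest variety rather than by the mean, and they give somewhat different percentages for the same data.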
The results suggest that breeding reduced not only plant height, but also the time to anthesis. The extension of the phase from booting to anthesis was associated with an increase in spike dry weight and grains per spike, which suggests that future increases in spike fertility could be achieved by lengthening that phase.