Intense structuring of plant breeding populations challenges the design of the training set (TS) in genomic selection (GS). An important open question is how the TS should be constructed from multiple related or unrelated small biparental families to predict progeny from individual crosses. Here, we used a set of five interconnected maize (Zea mays L.) populations of doubled-haploid (DH) lines derived from four parents to systematically investigate how the composition of the TS affects the prediction accuracy for lines from individual crosses. A total of 635 DH lines genotyped with 16,741 polymorphic SNPs were evaluated for five traits including Gibberella ear rot severity and three kernel yield component traits. The populations showed a genomic similarity pattern, which reflects the crossing scheme with a clear separation of full sibs, half sibs, and unrelated groups. Prediction accuracies within full-sib families of DH lines closely followed theoretical expectations, accounting for the influence of sample size and heritability of the trait. Prediction accuracies declined by 42% if full-sib DH lines were replaced by half-sib DH lines, but statistically significantly better results could be achieved if half-sib DH lines were available from both instead of only one parent of the validation population. Once both parents of the validation population were represented in the TS, including more crosses with a constant TS size did not increase accuracies. Unrelated crosses showing opposite linkage phases with the validation population resulted in negative or reduced prediction accuracies when used alone or in combination with related families, respectively. We suggest identifying and excluding such crosses from the TS.
Moreover, the observed variability among populations and traits suggests that these uncertainties must be taken into account in models optimizing the allocation of resources in GS.

Genomic prediction or selection, initially proposed and rapidly implemented in animal breeding (Meuwissen et al. 2001), is increasingly applied in plant breeding (Bernardo and Yu 2009; Lorenz et al. 2011; Morrell et al. 2011). However, with more investigations in plant breeding applications, new challenges emerge, mainly as the result of the greater possibilities of genetic manipulation and reproduction modes in plants compared to animals. Genomic predictions within diverse populations (Crossa et al. 2010; Riedelsheimer et al. 2012a,b; Windhausen et al. 2012) largely overlap with the scenarios in animal breeding. Predicting crossbred performance in animals also has some similarities with the prediction of maize hybrids from fully homozygous inbred lines drawn from genetically distant heterotic pools (Technow et al. 2012). Apart from the use of mixtures of close and distant relatives, however, little overlap with animal breeding exists in the case of multiple related or unrelated biparental families of inbred lines with a size ≤200. The latter situation is investigated in this study, because it covers the most rele...