The Collaborative Cross (CC) is a mouse genetic reference population whose range of applications includes quantitative trait loci (QTL) mapping. The design of a CC QTL mapping study involves multiple decisions, including which and how many strains to use and how many replicates per strain to phenotype, all viewed within the context of the hypothesized QTL architecture. Until now, these decisions have been informed largely by early power analyses that were based on simulated, hypothetical CC genomes. Now that more than 50 CC strains are available and more than 70 CC genomes have been observed, it is possible to characterize power based on realized CC genomes. We report power analyses based on extensive simulations and examine several key considerations: 1) the number of strains and biological replicates, 2) the QTL effect size, 3) the presence of population structure, and 4) the distribution of functionally distinct alleles among the founder strains at the QTL. We also provide general power estimates to aid in the design of future experiments. All analyses were conducted with our R package, SPARCC (Simulated Power Analysis in the Realized Collaborative Cross), developed for performing either large-scale power analyses or those tailored to particular CC experiments.

KEYWORDS recombinant inbred lines, haplotype association, allelic series, multiparental population, MPP, quantitative trait, complex trait

et al. 2014) Nonetheless, QTL mapping power depends in part on the number of strains available, and the number of strains available in the CC is, and will remain, far less than the 1,000 proposed in Churchill et al. (2004). At the time of this work, mice were available for 59 CC strains from the UNC Systems Genetics Core, with a subset of these 59 and an additional 11 expected to be offered through the Jackson Laboratory (JAX), for a potential total of 70 CC strains.
A reduction in strain numbers as a function of allelic incompatibilities between subspecies (Shorter et al. 2017) was expected, and it winnowed the number of resulting CC strains down to 50-70. Although smaller than originally intended, this population size reflects the biological and financial realities of maintaining a sustainable mammalian genome reference population. [Whereas

3. Evaluation of QTL detection accuracy, power, and false positive rate (FPR).
These are described in detail below, after a description of the genomic data that serves as the basis for the simulations.
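To make concrete how the number of strains, the number of replicates, and the QTL effect size feed into a simulation-based power estimate, the following is a minimal Python sketch. It is not SPARCC and does not use realized CC genomes: it assumes a single bi-allelic locus with randomly assigned alleles, unit-variance individual noise, no strain background effect, and a single-locus threshold taken from null simulations rather than a genome-wide one. The function names (`simulate_strain_means`, `association_stat`, `estimate_power`) and the effect scale (in units of the noise standard deviation) are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_strain_means(n_strains, n_reps, qtl_effect, rng):
    """Simulate one experiment; return (genotypes, strain_means).

    Assumption: each inbred strain carries allele 0 or 1 at a single
    bi-allelic QTL, and each replicate has independent N(0, 1) noise.
    """
    genotypes = [rng.randint(0, 1) for _ in range(n_strains)]
    means = []
    for g in genotypes:
        obs = [g * qtl_effect + rng.gauss(0.0, 1.0) for _ in range(n_reps)]
        means.append(statistics.fmean(obs))
    return genotypes, means


def association_stat(genotypes, means):
    """Absolute difference in average strain mean between allele groups."""
    g0 = [m for g, m in zip(genotypes, means) if g == 0]
    g1 = [m for g, m in zip(genotypes, means) if g == 1]
    if not g0 or not g1:
        return 0.0  # locus is monomorphic in this sample; no test possible
    return abs(statistics.fmean(g1) - statistics.fmean(g0))


def estimate_power(n_strains, n_reps, qtl_effect, n_sims=500, alpha=0.05, seed=0):
    """Fraction of simulated experiments whose statistic exceeds a null threshold."""
    rng = random.Random(seed)
    # Empirical null distribution: same design, QTL effect set to zero.
    null_stats = sorted(
        association_stat(*simulate_strain_means(n_strains, n_reps, 0.0, rng))
        for _ in range(n_sims)
    )
    threshold = null_stats[min(n_sims - 1, int((1 - alpha) * n_sims))]
    # Alternative: count how often the simulated QTL clears the threshold.
    hits = sum(
        association_stat(*simulate_strain_means(n_strains, n_reps, qtl_effect, rng))
        > threshold
        for _ in range(n_sims)
    )
    return hits / n_sims


if __name__ == "__main__":
    for effect in (0.1, 0.3, 1.0):
        power = estimate_power(n_strains=50, n_reps=5, qtl_effect=effect)
        print(f"effect={effect}: estimated power={power:.2f}")
```

Setting `qtl_effect` to zero in the second loop would give an empirical false positive rate close to `alpha` by construction. A realistic analysis would replace the random allele assignment with observed founder haplotypes and use a genome-wide significance threshold.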