Simulations of close relatives and identical by descent (IBD) segments are common in genetic studies, yet most past efforts have utilized sex averaged genetic maps and ignored crossover interference, thus omitting features known to affect the breakpoints of IBD segments. We developed Ped-sim, a method for simulating relatives that can utilize either sex-specific or sex averaged genetic maps and also either a model of crossover interference or the traditional Poisson model for inter-crossover distances. To characterize the impact of previously ignored mechanisms, we simulated data for all four combinations of these factors. We found that modeling crossover interference decreases the standard deviation of the IBD proportion by 10.4% on average in full siblings through second cousins. By contrast, sex-specific maps increase this standard deviation by 4.2% on average, and also impact the number of segments relatives share. Most notably, using sex-specific maps, the number of segments half-siblings share is bimodal; and when combined with interference modeling, the probability that sixth cousins have non-zero IBD ranges from 9.0 to 13.1%, depending on the sexes of the individuals through which they are related. We present new analytical results for the distributions of IBD segments under these models and show they match results from simulations. Finally, we compared IBD sharing rates between simulated and real relatives and find that the combination of sex-specific maps and interference modeling most accurately captures IBD rates in real data. Ped-sim is open source and available from https://github.com/williamslab/ped-sim.
Short title: Crossover interference and sex-specific maps shape IBD distributionsSimulations are ubiquitous throughout statistical genetics in order to generate data with known properties, enabling tests of inference methods and analyses of real world processes in settings where experimental data are challenging to collect. Simulating genetic data for relatives in a pedigree requires the synthesis of chromosomes parents transmit to their children. These chromosomes form as a mosaic of a given parent's two chromosomes, with the location of switches between the two parental chromosomes known as crossovers. Detailed information about crossover generation based on real data from humans now exists, including the fact that men and women have overall different rates (women produce ∼1.6 times more crossovers) and that real crossovers are subject to interference-whereby crossovers are further apart from one another than expected under a model that selects their locations randomly. Our new method, Ped-sim, can simulate pedigree data using these less commonly modeled crossover features, and we used it to evaluate the importance of sex-specific rates and interference in real data. These comparisons show that both factors shape the amount of DNA two relatives share identically, and that their inclusion in models of crossover better fit data from real relatives.