Author contributions: DCJ coordinated all analyses, isolated DNA for sequencing, analysed and filtered SNP calls, conducted diversity analysis and GWAS, and drafted the manuscript. CR produced phenotype data for growth on various solid media and growth rates in liquid media. AR conducted analysis of dating using mitochondrial data. DS conducted GWAS. MP analysed all phenotype data. TM identified LTR transposon insertions and analysed transposon insertion data. FXM conducted crosses for analysis of spore viability. ZI produced indel calls with Cortex. WL conducted analysis of recombination rate, linkage disequilibrium decay and PCA for distance between strains. TMKC assisted with phenotype and population analysis. RP analysed Cortex and GATK indel calls. MM conducted amino acid profiling. JLDL and AC produced automated measures of cell morphology. SB aligned reads and produced GATK SNP calls. GH analysed population structure using fineSTRUCTURE. BO'F estimated the TMRCA from the nuclear genome using ACG. TK identified LTR transposon insertions. JTS produced de novo assemblies. LB developed the custom Workspace workflow Spotsizer. BT assisted with sequence analysis. DAB assisted with analysis of novel genes. TS assisted with strain verification. SC produced images of wild strains and assisted with strain verification. JEEUH assisted with SNP validation. LvT and MT assisted with LTR validation. LJ and JL assisted with manual measures of cell morphology and FACS. SA produced gene expression data. MF, KM and ND assisted with sequencing. WB initiated and assisted with strain collection. JH coordinated manual measures of cell morphology and FACS. RECS coordinated automated measures of cell morphology. MR coordinated amino acid profiling. NM conducted analysis of recombination and linkage disequilibrium and advised on aspects of diversity and GWAS. DJB advised on GWAS. RD facilitated sequencing. JB contributed to the initiation and development of the project and financed the JB laboratory.
Accessions: Sequence data are archived in the European Nucleotide Archive (www.ebi.ac.uk/ena/), Study Accessions PRJEB2733 and PRJEB6284 (Supplementary Table 7). All SNPs and indels were submitted to NCBI dbSNP (www.ncbi.nlm.nih.gov/SNP/). Accessions are 974514578-974688138 (SNPs) and 974702618-974688139 (indels).
Abstract: Natural variation within species reveals aspects of genome evolution and function. The fission yeast Schizosaccharomyces pombe is an important model for eukaryotic biology, but researchers typically use one standard laboratory strain. To extend the utility of this model, we surveyed the genomic and phenotypic variation in 161 natural isolates. We sequenced the genomes of all strains, revealing moderate genetic diversity (π = 3 × 10⁻³) and weak global population structure. We estimate that dispersal of S. pombe began within human antiquity (~340 BCE), and that ancestors of these strains reached the Americas in ~1623 CE. We quantified 74 traits, revealing substantial heritable phenotypic diversity. We cond...
Linkage disequilibrium (LD) provides information about positional cloning, linkage, and evolution that cannot be inferred from other evidence, even when a correct sequence and a linkage map based on more than a handful of families become available. We present theory to construct an LD map for which distances are additive and population-specific maps are expected to be approximately proportional. For this purpose, there is only a modest difference in relative efficiency of haplotypes and diplotypes: resolving the latter into 2-locus haplotypes has significant cost or error and increases information by about 50%. LD maps for a cold spot in 19p13.3 and a more typical region in 3q21 are optimized by interval estimates. For a random sample and trustworthy map, the value of LD at large distance can be predicted reliably from information over a small distance and does not depend on the evolutionary variance unless the sample size approaches the population size. Values of the association probability that can be distinguished from the value at large distance are determined not by population size but by time since a critical bottleneck. In these examples, omission of markers with significant Hardy-Weinberg disequilibrium does not improve the map, and widely discrepant draft sequences have similar estimates of the genetic parameters. The LD cold spot in 19p13.3 gives an unusually high estimate of time, supporting an argument that this relationship is general. As predicted for a region with ancient haplotypes or uniformly high recombination, there is no clear evidence of LD clustering. In contrast, the 3q21 region is resolved into alternating blocks of stable and decreasing LD, as expected from crossover clustering. Construction of a genome-wide LD map requires data not yet available, which may be complemented but not replaced by a catalog of haplotypes.
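The additivity property described above can be illustrated with a minimal sketch. It assumes that each interval between adjacent markers contributes the product of its physical length and an interval-specific decay parameter, and that the association probability declines exponentially with map distance toward its large-distance value. The function names, the symbols `M` and `L`, and all numbers are hypothetical choices for illustration; they are not taken from the text.

```python
import math
from itertools import accumulate


def ld_map_positions(interval_kb, epsilon_per_kb):
    """Cumulative LD-map position of each marker, with the first marker at 0.

    interval_kb: physical length of each interval between adjacent markers.
    epsilon_per_kb: assumed decay parameter for each interval (hypothetical).
    """
    if len(interval_kb) != len(epsilon_per_kb):
        raise ValueError("need one decay parameter per interval")
    steps = [e * d for e, d in zip(epsilon_per_kb, interval_kb)]
    return [0.0] + list(accumulate(steps))


def ld_distance(positions, i, j):
    """LD-map distance between markers i and j (additive by construction)."""
    return abs(positions[j] - positions[i])


def expected_association(ld_dist, M=1.0, L=0.0):
    """Assumed exponential decline from M at distance 0 toward L at large distance."""
    return (1.0 - L) * M * math.exp(-ld_dist) + L


# Hypothetical region: a short interval with rapid decline (a "cold spot" for LD)
# sits between two intervals in which LD is more stable.
interval_kb = [10.0, 5.0, 25.0]
epsilon_per_kb = [0.002, 0.1, 0.001]
positions = ld_map_positions(interval_kb, epsilon_per_kb)

for i in range(len(positions) - 1):
    d = ld_distance(positions, i, i + 1)
    print(f"interval {i}: map length {d:.3f}, "
          f"expected association {expected_association(d, L=0.1):.3f}")
```

Because marker positions are cumulative sums, the distance between any two markers equals the sum of the distances across the intervals between them, which is what makes the map additive. Under this sketch, a population-specific map would differ mainly by a scale factor on the decay parameters.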
Positional cloning of genes for disease susceptibility depends on linkage and "allelic association" (also called "linkage disequilibrium" or LD). A cold spot for LD is an interval in which LD declines rapidly with distance: neither linkage nor LD is proportional to the sequence-based map. To the extent that LD mirrors recombination it can extend the low resolution of linkage: a cold spot for LD is a hot spot for recombination and vice versa. However, this correspondence is disturbed by other factors that cannot be reliably predicted. To the extent that these phenomena are important, both the physical and linkage maps are unreliable guides to LD. We need an LD map to facilitate positional cloning, extend the resolution of the linkage map, compare populations, infer their paleodemography, and detect selective sweeps and other events of evolutionary interest. LD mapping is at the stage of linkage maps nearly a century ago, with the same promise. The definitive property of a chromosome map, whether physical or genetic, is that its distances are additive. With this constraint, we require a standard LD map to which population-specific maps are approximately proportional. Here...