The remarkable evolutionary history of the common bean (Phaseolus vulgaris L.) has led to the emergence of three main wild genepools corresponding to three different ecogeographic areas: Mesoamerica, the Andes and northern Peru/Ecuador. Recent studies have proposed novel scenarios and the northern Peru/Ecuador population has been described as a new species called P. debouckii, rekindling the debate about the origin of P. vulgaris. Here we shed light on the origin of P. vulgaris by analysing the chloroplast and nuclear genomes of a large varietal collection representing the entire geographical distribution of wild forms. We assembled 37 chloroplast genomes de novo and used them to construct a time frame for the divergence of the genotypes under investigation, revealing that the separation of the Mesoamerican and northern Peru/Ecuador genepools occurred ~0.15 Mya. Our results clearly support a Mesoamerican origin of the common bean and reject the recent P. debouckii hypothesis. These results also imply two independent migratory events from Mesoamerica to the North and South Andes, probably facilitated by birds. Our work represents a paradigmatic example of the importance of taking into account recombination events when investigating phylogeny and of the analysis of wild forms when studying the evolutionary history of a crop species.
High-throughput genotyping enables the large-scale analysis of genetic diversity in population genomics and genome-wide association studies that combine the genotypic and phenotypic characterization of large collections of accessions. Sequencing-based approaches for genotyping are progressively replacing traditional genotyping methods due to the lower ascertainment bias. However, genome-wide genotyping based on sequencing becomes expensive in species with large genomes and a high proportion of repetitive DNA. Here we describe the use of CRISPR-Cas9 technology to deplete repetitive elements in the 3.76-Gb genome of lentil (Lens culinaris), 84% of which consists of repeats, thus concentrating the sequencing data on coding and regulatory regions (single-copy regions). We designed a custom set of 566,766 gRNAs targeting 2.9 Gbp of repeats and excluding repetitive regions overlapping annotated genes and putative regulatory elements based on ATAC-seq data. The novel depletion method removed ~40% of reads mapping to repeats, increasing those mapping to single-copy regions by ~2.6-fold. When analyzing 25 million fragments, this repeat-to-single-copy shift in the sequencing data increased the number of genotyped bases by ~10-fold compared to nondepleted libraries. Under the same conditions, we were also able to identify ~12-fold more genetic variants in the single-copy regions and increased the genotyping accuracy by rescuing thousands of heterozygous variants that otherwise would be missed due to low coverage. The method performed similarly regardless of the multiplexing level, type of library or genotypes, including different cultivars and a closely-related species (L. orientalis). Our results demonstrated that CRISPR-Cas9-driven repeat depletion focuses sequencing data on meaningful genomic regions, thus improving high-density and genome-wide genotyping in large and repetitive genomes.
During citizen-science expeditions to the Ulu Temburong National Park, Brunei, several individuals of a semi-slug of the genus Microparmarion were collected that, based on morphology and in-the-field DNA barcoding, proved to belong to an undescribed species. In this paper, we describe Microparmarion sallehi Wu, Ezzwan & Hamdani, n. sp., after field centre supervisor Md Salleh Abdullah Bat. We provide details on the external and internal reproductive morphology, the shell and the ecology of the type locality, as well as a diagnosis comparing it with related species. DNA barcodes were generated for five individuals and used for a phylogenetic reconstruction. Microparmarion sallehi sp. n. and M. exquadratus Schilthuizen et al., 2019 so far are the only Bornean species of the genus that live in lowland forest; other species are found in montane forests.
High-throughput chromosome conformation capture (Hi-C) is widely used for scaffolding in de novo assembly because it produces highly contiguous genomes, but its indirect statistical approach can introduce connection errors. We employed optical mapping (Bionano Genomics) as an orthogonal scaffolding technology to assess the structural solidity of Hi-C reconstructed scaffolds. Optical maps were used to assess the correctness of five de novo genome assemblies based on long-read sequencing for contig generation and Hi-C for scaffolding. Hundreds of inconsistencies were found between the reconstructions generated using the Hi-C and optical mapping approaches. Manual inspection, exploiting raw long-read sequencing data and optical maps, confirmed that several of these conflicts were derived from Hi-C joining errors. Such misjoins were widespread, involved the connection of both small and large contigs, and even overlapped annotated genes. We conclude that the integration of optical mapping data after, not before, Hi-C-based scaffolding, improves the quality of the assembly and limits reconstruction errors by highlighting misjoins that can then be subjected to further investigation.
High-throughput genotyping facilitates the large-scale analysis of genetic diversity in population genomics and genome-wide association studies that combine the genotypic and phenotypic characterization of large collections of wild and domesticated germplasm. Genotyping by sequencing is progressively replacing traditional genotyping methods due to the lower ascertainment bias. However, genome-wide genotyping by sequencing becomes expensive in species with large genomes and a high proportion of repetitive DNA. Here we describe the use of CRISPR/Cas9 technology to deplete repetitive elements in the 3.76-Gb genome of lentil (Lens culinaris), 84% of which consists of repeats, thus concentrating the sequencing data on coding and regulatory regions (unique regions). We designed a custom set of 566,722 gRNAs, each with at least 25 recognition sites, targeting 2.9 Gbp of repeats in 500-bp insert sequencing libraries. We excluded repetitive regions overlapping annotated genes and putative regulatory elements based on ATAC-Seq data. The novel depletion method removed 40% of reads mapping to repeats, increasing those mapping to unique and functional regions by 2.6-fold. This repeat-to-unique shift in the sequencing data increased the number of genotyped bases by up to 17-fold compared to non-depleted libraries. We were also able to identify up to 18-fold more genetic variants in the unique regions and increased the genotyping accuracy by rescuing thousands of heterozygous variants that otherwise would be missed due to low coverage. The method performed similarly regardless of the multiplexing level, type of library or genotypes, including different cultivars and a closely-related species (L. orientalis). Our results confirmed that CRISPR/Cas9-driven repeat depletion focuses sequencing data on meaningful genomic regions, helping to improve high-density and genome-wide genotyping in large and repetitive genomes.