Species identification using DNA sequences, known as DNA barcoding, has been widely used in many applied fields. Current barcoding methods are usually based on a single mitochondrial locus, such as cytochrome c oxidase subunit I (COI). This type of barcoding method does not always work when applied to species that are separated by short divergence times or that contain introgressed genes from closely related species. Herein we introduce a more effective multilocus barcoding framework that is based on gene capture and “next-generation” sequencing. We selected 500 independent nuclear markers for ray-finned fishes and designed a three-step pipeline for multilocus DNA barcoding. We applied our method to two exemplar datasets, each containing a pair of sister fish species for which the COI barcoding approach failed: Siniperca chuatsi vs. Sini. kneri and Sicydium altum vs. Sicy. adelum. Both our empirical and simulated results demonstrated that, under limited gene flow and sufficient separation time, species can be correctly identified using the multilocus barcoding method. We anticipate that, as the cost of DNA sequencing continues to fall, our multilocus barcoding approach will eclipse existing single-locus DNA barcoding methods as a means to better understand the diversity of the living world.
Gene capture coupled with next‐generation sequencing has become one of the preferred methods of subsampling genomes for phylogenomic studies. Many exon markers have been developed in plants, sharks, frogs, reptiles, fishes, and others, but no universal exon markers have been tested in ray‐finned fishes. Here, we identified a suite of “single‐copy” protein‐coding sequence (CDS) markers by comparing eight fish genomes, and tested them empirically in 83 species (33 families and nine orders or higher clades: Acipenseriformes, Lepisosteiformes, Elopomorpha, Osteoglossomorpha, Clupeiformes, Cypriniformes, Gobiaria, Carangaria, and Eupercaria; sensu Betancur et al. 2013). Sorting the markers according to their completeness and phylogenetic decisiveness in the taxa tested resulted in a selection of 4,434 markers, which proved useful in reconstructing phylogenies of the ray‐finned fishes at different taxonomic levels. We also proposed a strategy for refining bait (probe) design a posteriori based on empirical data. The markers that we have developed may greatly enrich the battery of exon markers available for phylogenomic study in ray‐finned fishes.
Species identification using DNA sequences, known as DNA barcoding, has been widely used in many applied fields. Current barcoding methods are usually based on a single mitochondrial locus, such as cytochrome c oxidase subunit I (COI). This type of barcoding is not always effective when applied to species that are separated by short divergence times or that contain introgressed genes from closely related species. Herein we introduce a more effective multilocus barcoding framework that is based on gene capture and “next-generation” sequencing, and provide both empirical and simulation tests of its efficacy. We examine genetic distinctness in two pairs of sister species of fishes: Siniperca chuatsi vs. S. kneri and Sicydium altum vs. S. adelum. The COI barcoding approach failed to identify the species in both cases. Results revealed that distinctness between S. chuatsi and S. kneri increased as more independent loci were added. By contrast, S. altum and S. adelum could not be distinguished even with all loci. Analyses of population structure and gene flow suggested that the two species of Siniperca diverged from each other a long time ago but have unidirectional gene flow, whereas the two species of Sicydium are not separated from each other and have high bidirectional gene flow. Simulations demonstrate that under limited gene flow (< 0.00001 per gene per generation) and sufficient separation time (> 100,000 generations), we can correctly identify species using more than 90 loci. Finally, we selected 500 independent nuclear markers for ray-finned fishes and designed a three-step pipeline for multilocus DNA barcoding.
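The observation that distinctness can increase as independent loci are added may be illustrated with a toy likelihood-based assignment. This is only a sketch and not the three-step pipeline from the abstract: the allele frequencies, the specimen, and the function names `log_likelihood` and `assign` are all hypothetical, and each locus is reduced to a single haploid biallelic site treated as independent.

```python
import math

def log_likelihood(alleles, freqs):
    """Sum per-locus log-probabilities of the observed alleles (0 or 1),
    where freqs[i] is the frequency of allele 1 at locus i."""
    total = 0.0
    for allele, p in zip(alleles, freqs):
        total += math.log(p if allele == 1 else 1.0 - p)
    return total

def assign(alleles, freqs_a, freqs_b):
    """Return (label, log-likelihood ratio); a positive ratio favours species A."""
    llr = log_likelihood(alleles, freqs_a) - log_likelihood(alleles, freqs_b)
    return ("A" if llr > 0 else "B"), llr

# Hypothetical frequencies: allele 1 is only slightly more common in species A.
N_LOCI = 90
freqs_a = [0.6] * N_LOCI
freqs_b = [0.4] * N_LOCI

# Made-up specimen carrying allele 1 at two of every three loci.
specimen = [0 if i % 3 == 0 else 1 for i in range(N_LOCI)]

# Assign the specimen using the first n loci only.
for n in (1, 10, 90):
    label, llr = assign(specimen[:n], freqs_a[:n], freqs_b[:n])
    print(n, label, round(llr, 2))
```

In this toy setup the first locus taken alone favours species B, whereas the summed evidence across 10 or 90 loci favours species A. No single weakly informative locus is decisive, but the log-likelihood ratio grows as independent loci accumulate.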