Our understanding of phylogenetic relationships among bony fishes has been transformed by analysis of a small number of genes, but uncertainty remains around critical nodes. Genome-scale inferences so far have sampled a limited number of taxa and genes. Here we leverage 144 genomes and 159 transcriptomes to investigate fish evolution with an unparalleled scale of data: >0.5 Mb from 1,105 orthologous exon sequences from 303 species, representing 66 out of 72 ray-finned fish orders. We apply phylogenetic tests designed to trace the effect of whole-genome duplication events on gene trees and find paralogy-free loci using a bioinformatics approach. Genome-wide data support the structure of the fish phylogeny, and hypothesis-testing procedures appropriate for phylogenomic datasets using explicit gene genealogy interrogation settle some long-standing uncertainties, such as the branching order at the base of the teleosts and among early euteleosts, and the sister lineage to the acanthomorph and percomorph radiations. Comprehensive fossil calibrations date the origin of all major fish lineages before the end of the Cretaceous.
Phylogenomic studies using genome-wide datasets are quickly becoming the state of the art for systematics and comparative studies, but in many cases they yield strongly supported yet incongruent results. The extent to which this conflict is real depends on different sources of error potentially affecting big datasets (assembly, stochastic, and systematic error). Here, we apply a recently developed methodology (gene genealogy interrogation, GGI) and data curation to new and published datasets with more than 1000 exons, 500 ultraconserved element (UCE) loci, and transcriptomic sequences that support incongruent hypotheses. The contentious non-monophyly of the order Characiformes proposed by two studies is shown to be a spurious outcome induced by sample contamination in the transcriptomic dataset and an ambiguous result due to poor taxonomic sampling in the UCE dataset. By exploring the effects of the number of taxa and loci used for analysis, we show that the power of GGI to discriminate among competing hypotheses is diminished by limited taxonomic sampling, but is less sensitive to gene sampling. Taken together, our results reinforce the notion that merely increasing the number of genetic loci for a few representative taxa is not a robust strategy to advance phylogenetic knowledge of recalcitrant groups. We leverage the expanded exon capture dataset generated here for Characiformes (206 species in 23 out of 24 families) to produce a comprehensive phylogeny and a revised classification of the order.