Insects are the most speciose group of animals, but the phylogenetic relationships of many major lineages remain unresolved. We inferred the phylogeny of insects from 1478 protein-coding genes. Phylogenomic analyses of nucleotide and amino acid sequences, with site-specific nucleotide or domain-specific amino acid substitution models, produced statistically robust and congruent results resolving previously controversial phylogenetic relationships. We dated the origin of insects to the Early Ordovician [~479 million years ago (Ma)], of insect flight to the Early Devonian (~406 Ma), of major extant lineages to the Mississippian (~345 Ma), and the major diversification of holometabolous insects to the Early Cretaceous. Our phylogenomic study provides a comprehensive, reliable scaffold for future comparative analyses of evolutionary innovations among insects.
Arthropods were the first animals to conquer land and air. They encompass more than three quarters of all described living species. This extraordinary evolutionary success is based on an astoundingly wide array of highly adaptive body organizations. A lack of robustly resolved phylogenetic relationships, however, currently impedes the reliable reconstruction of the underlying evolutionary processes. Here, we show that phylogenomic data can substantially advance our understanding of arthropod evolution and resolve several conflicts among existing hypotheses. We assembled a data set of 233 taxa and 775 genes from which an optimally informative data set of 117 taxa and 129 genes was finally selected using new heuristics and compared with the unreduced data set. We included novel expressed sequence tag (EST) data for 11 species and all published phylogenomic data augmented by recently published EST data on taxonomically important arthropod taxa. This thorough sampling reduces the chance of obtaining spurious results due to stochastic effects of undersampling taxa and genes. Orthology prediction of genes, alignment masking tools, and selection of the most informative genes due to a balanced taxa-gene ratio using new heuristics were established. Our optimized data set robustly resolves major arthropod relationships. We found strong support for a sister group relationship of onychophorans and euarthropods and strong support for a close association of tardigrades and cycloneuralia. Within pancrustaceans, our analyses yielded paraphyletic crustaceans and monophyletic hexapods and robustly resolved monophyletic endopterygote insects. However, our analyses also showed that a few deep splits recently thought to be resolved, for example the position of myriapods, are remarkably sensitive to the method of analysis.
Remipedes are a small and enigmatic group of crustaceans, first described only 30 years ago. Analyses of both morphological and molecular data have recently suggested a close relationship between Remipedia and Hexapoda. If true, the remipedes occupy an important position in pancrustacean evolution and may be pivotal for understanding the evolutionary history of crustaceans and hexapods. However, it is important to test this hypothesis using new data and new types of analytical approaches. Here, we assembled a phylogenomic data set of 131 taxa, incorporating newly generated 454 expressed sequence tag (EST) data from six species of crustaceans, representing five lineages (Remipedia, Laevicaudata, Spinicaudata, Ostracoda, and Malacostraca). This data set includes all crustacean species for which EST data are available (46 species), and our largest alignment encompasses 866,479 amino acid positions and 1,886 genes. A series of phylogenomic analyses was performed to evaluate pancrustacean relationships. We significantly improved the quality of our data for predicting putative orthologous genes and for generating data subsets by matrix reduction procedures, thereby improving the signal to noise ratio in the data. Eight different data sets were constructed, representing various combinations of orthologous genes, data subsets, and taxa. Our results demonstrate that the different ways to compile an initial data set of core orthologs and the selection of data subsets by matrix reduction can have marked effects on the reconstructed phylogenetic trees. Nonetheless, all eight data sets strongly support Pancrustacea with Remipedia as the sister group to Hexapoda. This is the first time that a sister group relationship of Remipedia and Hexapoda has been inferred using a comprehensive phylogenomic data set that is based on EST data. We also show that selecting data subsets with increased overall signal can help to identify and prevent artifacts in phylogenetic analyses.
Phylogenetic relationships of the primarily wingless insects are still considered unresolved. Even the most comprehensive phylogenomic studies that addressed this question did not yield congruent results. To address these problems, we analyzed the sources of incongruence in these phylogenomic studies using an extended transcriptome data set. Our analyses showed that unevenly distributed missing data can be severely misleading by inflating node support despite the absence of phylogenetic signal. Consequently, only decisive data sets should be used, that is, data sets that exclusively comprise data blocks containing all taxa whose relationships are addressed. Additionally, we used Four-cluster Likelihood Mapping (FcLM) to measure the degree of congruence among genes of a data set, as a measure of support alternative to the bootstrap. FcLM showed incongruent signal among genes, which in our case is correlated neither with functional class assignment of these genes nor with model misspecification due to unpartitioned analyses. The data set analyzed here is currently the largest covering primarily wingless insects, but it failed to elucidate their interordinal phylogenetic relationships. Although this is unsatisfying from a phylogenetic perspective, we try to show that analyses of structure and signal within phylogenomic data can protect us from biased phylogenetic inferences due to analytical artifacts.
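The decisiveness criterion described in this abstract can be illustrated with a minimal sketch. It assumes the simplest reading of the text: a data block (here, a gene) is kept only if it contains every taxon whose relationships are under study. The function name `decisive_blocks`, the gene labels, and the taxon lists are hypothetical placeholders for illustration, not the authors' implementation or data.

```python
from typing import Dict, List, Set


def decisive_blocks(block_taxa: Dict[str, Set[str]],
                    focal_taxa: Set[str]) -> List[str]:
    """Return the names of data blocks that contain every focal taxon.

    block_taxa maps a block (e.g. gene) name to the set of taxa that
    have sequence data in that block. A block is retained only if the
    focal taxa are a subset of its taxa (an assumed, simplified
    reading of 'decisive').
    """
    return sorted(name for name, taxa in block_taxa.items()
                  if focal_taxa <= taxa)


# Hypothetical example: three genes with uneven taxon coverage.
coverage = {
    "geneA": {"Protura", "Collembola", "Diplura", "Pterygota"},
    "geneB": {"Collembola", "Diplura"},  # Protura missing
    "geneC": {"Protura", "Collembola", "Diplura"},
}
focal = {"Protura", "Collembola", "Diplura"}

kept = decisive_blocks(coverage, focal)
print(kept)
```

In this toy input, `geneB` is dropped because it lacks one focal taxon, so it cannot contribute signal about that taxon's placement.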
Background: Whenever different data sets arrive at conflicting phylogenetic hypotheses, only testable causal explanations of sources of errors in at least one of the data sets allow us to critically choose among the conflicting hypotheses of relationships. The large (28S) and small (18S) subunit rRNAs are among the most popular markers for studies of deep phylogenies. However, some nodes supported by these data are suspected of being artifacts caused by peculiarities of the evolution of these molecules. Arthropod phylogeny is an especially controversial subject dotted with conflicting hypotheses which are dependent on data set and method of reconstruction. We assume that phylogenetic analyses based on these genes can be improved further by i) enlarging the taxon sample, ii) employing more realistic models of sequence evolution incorporating nonstationary substitution processes, and iii) considering covariation and pairing of sites in rRNA genes.