Introduction: Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa.
Results: cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature.
Conclusions: We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene fragments) and protein families for ten newly sequenced non-model organisms, some of commercial importance (i.e., Octopus vulgaris). These comprehensive sets of genes can be readily used for phylogenetic analysis, gene expression profiling, developmental analysis, and can also be a powerful resource for gene discovery. The characterization of the transcriptomes of such a diverse array of animal species permitted the comparison of sequencing depth, functional annotation, and efficiency of genomic sampling using the same pipelines, which proved to be similar for all considered species. In addition, the datasets revealed their potential as a resource for paralogue detection, a recurrent concern in various aspects of biological inquiry, including phylogenetics, molecular evolution, development, and cellular biochemistry.
Relationships between the five extant orders of centipedes have been considered solved based on morphology. Phylogenies based on samples of up to a few dozen genes have largely been congruent with the morphological tree apart from an alternative placement of one order, the relictual Craterostigmomorpha, consisting of two species in Tasmania and New Zealand. To address this incongruence, novel transcriptomic data were generated to sample all five orders of centipedes and also used as a test case for studying gene-tree incongruence. Maximum likelihood and Bayesian mixture model analyses of a data set composed of 1,934 orthologs with 45% missing data, as well as the 389 orthologs in the least saturated, stationary quartile, retrieve strong support for a sister-group relationship between Craterostigmomorpha and all other pleurostigmophoran centipedes, of which the latter group is newly named Amalpighiata. The Amalpighiata hypothesis, which shows little gene-tree incongruence and is robust to the influence of among-taxon compositional heterogeneity, implies convergent evolution in several morphological and behavioral characters traditionally used in centipede phylogenetics, such as maternal brood care, but accords with patterns of first appearances in the fossil record.
Previous phylogenetic analyses of the centipede order Scolopendromorpha indicated a fundamental division into blind and ocellate clades. These analyses corroborated the monophyly of most families and tribes but suggested that several species-rich, cosmopolitan genera in traditional and current classifications are polyphyletic. Denser taxon sampling is applied to a dataset of 122 morphological characters and sequences for four nuclear and mitochondrial loci. Phylogenetic analyses including 98 species and subspecies of Scolopendromorpha employ parsimony under dynamic and static homology schemes as well as maximum likelihood and Bayesian inference of multiple sequence alignments. The monotypic Australian genera Notiasemus and Kanparka nest within Cormocephalus and Scolopendra, respectively, and the New Caledonian Campylostigmus is likewise a clade within Cormocephalus. New World Scolopendra are more closely related to Hemiscolopendra and Arthrorhabdus than to Scolopendra s.s., which is instead closely allied to Asanada; the tribe Asanadini nests within Scolopendrini for molecular and combined datasets. The generic classification of Otostigmini has a poor fit to phylogenetic relationships, although nodal support within this tribe is weak. New synonymies are proposed for Ectonocryptopinae Shelley & Mercurio, 2005 (= Newportiinae Pocock, 1896), Asanadini Verhoeff, 1907 (= Scolopendrini Leach, 1814), and Kanparka Waldock & Edgecombe, 2012 (= Scolopendra Linnaeus, 1758). Scolopendrid systematics largely depicts incongruence between phylogeny and classification rather than between morphology and molecules.