Multi-locus phylogenetic studies of echinoderms based on Sanger and RNA-seq technologies and the fossil record have provided evidence for the Asterozoa-Echinozoa hypothesis. This hypothesis posits a sister relationship between asterozoan classes (Asteroidea and Ophiuroidea) and a similar relationship between echinozoan classes (Echinoidea and Holothuroidea). Despite this consensus around Asterozoa-Echinozoa, phylogenetic relationships within the class Asteroidea (sea stars or starfish) have been controversial for over a century. Open questions include relationships within asteroids and the status of the enigmatic taxon Xyloplax. Xyloplax is thought by some to represent a newly discovered sixth class of echinoderms, and by others to be an asteroid. To address these questions, we applied a novel workflow to a large RNA-seq dataset that encompassed a broad taxonomic and genomic sample. This study included 15 species sampled from all extant orders and 13 families, plus four ophiuroid species as an outgroup. To expand the taxonomic coverage, the study also incorporated five previously published transcriptomes and one previously published expressed sequence tags (EST) dataset. We developed and applied methods that used a range of alignment parameters with increasing permissiveness in terms of gap characters present within an alignment. This procedure facilitated the selection of phylogenomic data subsets from large amounts of transcriptome data. The results included 19 nested data subsets that ranged from 37 to 4,281 loci. Tree searches on all data subsets reconstructed Xyloplax as a velatid asteroid rather than a new class. This result implies that asteroid morphology remains labile well beyond the establishment of the body plan of the group. In the phylogenetic tree with the highest average asteroid nodal support, several monophyletic groups were recovered.
In this tree, Forcipulatida and Velatida are monophyletic and form a clade that includes Brisingida as sister to Forcipulatida. Xyloplax is consistently recovered as sister to Pteraster. Paxillosida and Spinulosida are each monophyletic, with Notomyotida as sister to the Paxillosida. Valvatida is recovered as paraphyletic. The results from other data subsets are largely consistent with these results. Our results support the hypothesis that the earliest divergence event among extant asteroids separated Velatida and Forcipulatacea from Valvatacea and Spinulosida.
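The idea of building nested data subsets by admitting progressively more gap characters can be illustrated with a minimal sketch. The abstract does not state the actual selection criterion, so the per-locus gap fraction used here, the function names `gap_fraction` and `nested_subsets`, the thresholds, and the toy alignments are all assumptions made for illustration only.

```python
def gap_fraction(alignment):
    """Fraction of cells in an alignment (list of equal-length strings) that are gaps."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count("-") for row in alignment)
    return gaps / total if total else 1.0


def nested_subsets(loci, thresholds):
    """Map each gap threshold to the sorted names of loci at or below that gap fraction.

    Because a locus that passes a strict threshold also passes every more
    permissive one, the resulting subsets are nested by construction.
    """
    fractions = {name: gap_fraction(aln) for name, aln in loci.items()}
    return {
        t: sorted(name for name, frac in fractions.items() if frac <= t)
        for t in sorted(thresholds)
    }


# Toy alignments (hypothetical data, not from the study).
loci = {
    "locusA": ["ACGT", "ACGT", "ACGT"],  # no gaps
    "locusB": ["AC-T", "ACGT", "A-GT"],  # 2 of 12 cells are gaps
    "locusC": ["----", "AC--", "ACGT"],  # 6 of 12 cells are gaps
}
subsets = nested_subsets(loci, [0.0, 0.2, 0.5])
for threshold, names in subsets.items():
    print(threshold, names)
```

Each more permissive threshold adds loci without removing any, which mirrors the nesting described in the abstract; a real workflow would presumably vary aligner parameters rather than filter on a single summary statistic.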
Tissue inhibitors of metalloproteinases (TIMPs) help regulate the extracellular matrix (ECM) in animals, mostly by inhibiting matrix metalloproteinases (MMPs). They are important activators of mutable collagenous tissue (MCT), which has been extensively studied in echinoderms, and the four TIMP copies in humans have been studied for their role in cancer. To understand the evolution of TIMPs, we combined 405 TIMPs from an echinoderm transcriptome dataset built from 41 specimens representing all five classes of echinoderms with variants from protostomes and chordates. We used multiple sequence alignment with various stringencies of alignment quality to cull highly divergent sequences and then conducted phylogenetic analyses using both nucleotide and amino acid sequences. Phylogenetic hypotheses consistently recovered TIMPs as diversifying in the ancestral deuterostome, with these early lineages continuing to diversify in echinoderms. The four vertebrate TIMPs diversified from a single copy in the ancestral chordate, all other copies being lost. Consistent with greater MCT needs owing to body wall liquefaction, evisceration, autotomy and reproduction by fission, holothuroids had significantly more TIMPs and higher read depths per contig. Ten cysteine residues, an HPQ binding site and several other residues were conserved in at least 70% of all TIMPs. The conservation of binding sites and the placement of echinoderm TIMPs involved in MCT modification suggest that ECM regulation remains the primary function of TIMP genes, although within this role there are a large number of specialized copies.
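A conservation cutoff such as "conserved in at least 70% of all TIMPs" can be checked column by column in an amino acid alignment. The sketch below is one plausible way to do so, not the authors' procedure: the function name `conserved_columns`, the choice to count gaps against conservation, and the toy sequences are assumptions.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=0.7):
    """Return (column index, residue, fraction) for conserved columns.

    A column is reported when its most common residue occurs in at least
    min_fraction of all sequences. Gaps are never reported as the conserved
    state, but they stay in the denominator, so they count against conservation.
    """
    n_sequences = len(alignment)
    hits = []
    for index, column in enumerate(zip(*alignment)):
        counts = Counter(residue for residue in column if residue != "-")
        if not counts:
            continue  # column is entirely gaps
        residue, count = counts.most_common(1)[0]
        fraction = count / n_sequences
        if fraction >= min_fraction:
            hits.append((index, residue, fraction))
    return hits


# Toy aligned fragments (hypothetical, not real TIMP sequences).
toy_alignment = ["CHPQA", "CHPQG", "CHPQ-", "CHAQT", "SHPQA"]
conserved = conserved_columns(toy_alignment, min_fraction=0.7)
for index, residue, fraction in conserved:
    print(index, residue, round(fraction, 2))
```

In this toy input the first four columns pass the 70% cutoff and the last, variable column does not.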
Background: One of our goals for the echinoderm tree of life project (http://echinotol.org) is to identify orthologs suitable for phylogenetic analysis from next-generation transcriptome data. The current dataset is the largest assembled for echinoderm phylogeny and transcriptomics. We used RNA-Seq to profile adult tissues from 42 echinoderm specimens from 24 orders and 37 families. To sample members of clades that span key evolutionary divergences, many of our exemplars were collected from deep and polar seas. Description: A small fraction of the transcriptome data we produced is being used for phylogenetic reconstruction. Thus, to make a larger dataset available to researchers with a wide variety of interests, we made a web-based application, EchinoDB (http://echinodb.uncc.edu). EchinoDB is a repository of orthologous transcripts from echinoderms that is searchable via keywords and sequence similarity. Conclusions: From transcripts we identified 749,397 clusters of orthologous loci. We have developed the information technology to manage and search the loci and their annotations with respect to the sea urchin (Strongylocentrotus purpuratus) genome. Several users have already taken advantage of these data for spin-off projects in developmental biology, gene family studies, and neuroscience. We hope others will search EchinoDB to discover datasets relevant to a variety of additional questions in comparative biology. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0883-2) contains supplementary material, which is available to authorized users.
HIV consensus sequences are used in various bioinformatic, evolutionary, and vaccine-related research. Since the previous HIV-1 subtype and CRF consensus sequences were constructed in 2002, the number of publicly available HIV-1 sequences has grown exponentially, especially from countries outside the EU and US. Here, we reconstruct 90 new HIV-1 subtype and CRF consensus sequences from 3,470 high-quality, representative, full genome sequences in the LANL HIV database. While subtypes and CRFs are unevenly spread across the world, in total 89 countries were represented. For consensus sequences that were based on at least 20 genomes, we found that on average 2.3% (range 0.8–10%) of the consensus genome site states changed from 2002 to 2021, of which about half were nucleotide state differences and the rest insertions and deletions. Interestingly, the 2021 consensus sequences were shorter than in 2002, and compared to 4,674 HIV-1 worldwide genome sequences, the 2021 consensuses were somewhat closer to the worldwide genome sequences, i.e., showing on average fewer nucleotide state differences. Some subtypes/CRFs have had limited geographical spread, and thus sampling of subtypes/CRFs is uneven, at least in part, due to the epidemiological dynamics. Thus, taken as a whole, the 2021 consensus sequences likely are good representations of the typical subtype/CRF genome nucleotide states. The new consensus sequences are available at the LANL HIV database.
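A minimal sketch may help make two steps from this abstract concrete: deriving a consensus from aligned genomes, and splitting the site-state changes between two consensuses into nucleotide differences and insertions/deletions. The abstract does not describe how the consensuses were actually built, so the simple per-column majority rule, the function names, and the toy sequences below are assumptions.

```python
from collections import Counter


def majority_consensus(alignment):
    """Most common state in each column, treating the gap '-' as a state."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*alignment)
    )


def site_differences(old, new):
    """Count (substitutions, indels) between two aligned consensus strings."""
    if len(old) != len(new):
        raise ValueError("consensus sequences must be aligned to equal length")
    substitutions = indels = 0
    for a, b in zip(old, new):
        if a == b:
            continue
        if a == "-" or b == "-":
            indels += 1  # one side has a gap: insertion or deletion
        else:
            substitutions += 1  # both sides have a nucleotide, but they differ
    return substitutions, indels


# Toy aligned genomes (hypothetical, far shorter than real HIV-1 genomes).
genomes = ["ACGTA", "ACGTA", "AC-TA", "ATGTC"]
consensus = majority_consensus(genomes)
changes = site_differences(consensus, "AC-TC")
print(consensus, changes)
```

Dividing the two counts by the aligned length would give the kind of percentage of changed site states that the abstract reports.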
It has been proposed that supertree approaches should be applied to large multilocus datasets to achieve computational tractability. Large datasets such as those derived from phylogenomics studies can be broken into many locus‐specific tree searches and the resulting trees can be stitched together via a supertree method. Using simulated data, workers have reported that they can rapidly construct a supertree that is comparable to the results of heuristic tree search on the entire dataset. To test this assertion with organismal data, we compare tree length under the parsimony criterion and computational time for 20 multilocus datasets using supertree (SuperFine and SuperTriplets) and supermatrix (heuristic search in TNT) approaches. Tree length and computational times were compared among methods using the Wilcoxon matched‐pairs signed rank test. Supermatrix searches produced significantly shorter trees than either supertree approach (SuperFine or SuperTriplets; P < 0.0002 in both cases). Moreover, the processing time of supermatrix search was significantly lower than SuperFine+locus‐specific search (P < 0.01) but roughly equivalent to that of SuperTriplets+locus‐specific search (P > 0.4, not significant). In conclusion, we show by using real rather than simulated data that there is no basis, either in time tractability or in tree length, for use of supertrees over heuristic tree search using a supermatrix for phylogenomics.
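The matched-pairs comparison can be sketched by computing the Wilcoxon signed-rank sums for paired tree lengths. This is an illustrative standard-library implementation of the test statistic only (no P value), and the tree lengths below are invented numbers, not values from the study.

```python
def signed_rank_sums(x, y):
    """Return (W_plus, W_minus) for paired samples x and y.

    Zero differences are dropped, and tied absolute differences receive
    the average of the ranks they span, as in the usual Wilcoxon procedure.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next absolute difference is tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical parsimony tree lengths for four datasets (invented values).
supertree_lengths = [105, 212, 330, 150]
supermatrix_lengths = [100, 210, 320, 151]
w_plus, w_minus = signed_rank_sums(supertree_lengths, supermatrix_lengths)
print(w_plus, w_minus)
```

A large imbalance between the two sums indicates that one method tends to give longer trees; a statistics package would then convert the smaller sum into a P value.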