Background: The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusc with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea–dwelling species will allow several pending evolutionary questions to be unlocked. Findings: We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads, and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from 3 different tissue types from 3 other species of squid (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome. Conclusions: This annotated draft genome of A. dux provides a critical resource to investigate the unique traits of this species, including its gigantism and key adaptations to deep-sea environments.
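For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: the length of the scaffold at which the running total of lengths, sorted from longest to shortest, first reaches half of the total assembly size. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (illustrative helper)."""
    ordered = sorted(lengths, reverse=True)   # longest scaffold first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths, not data from the paper.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

Under this definition, a scaffold N50 of 4.8 Mb would mean that at least half of the assembled bases lie in scaffolds of 4.8 Mb or longer.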
Antimicrobial peptides (AMPs) have emerged as promising compounds to treat a wide range of diseases. Their clinical potential lies in the wide range of mechanisms they can use for both killing microbes and modulating immune responses. However, the sheer size of the AMPs' chemical space (AMPCS), represented by more than 10^65 unique sequences, poses a major challenge for the discovery of new promising therapeutic peptides and for the identification of common structural motifs. Here, we introduce network science and a similarity searching approach to discover new promising AMPs, specifically antiparasitic peptides (APPs). We exploited the network-based representation of APPs' chemical space (APPCS) to retrieve valuable information by using three network types: chemical space (CSN), half-space proximal (HSPN), and metadata (METN). Some centrality measures were applied to identify in each network the most important and nonredundant peptides. Then, these central peptides were considered as queries (Qs) in group fusion similarity-based searches against a comprehensive collection of known AMPs, stored in the graph database StarPepDB, to propose new potential APPs. The performance of the resulting multiquery similarity-based search models (mQSSMs) was evaluated in five benchmarking data sets of APP/non-APPs. The predictions performed by the best mQSSM showed a strong-to-very-strong performance since their external Matthews correlation coefficient (MCC) values ranged from 0.834 to 0.965. Outstanding MCC values (>0.85) were attained by the mQSSM with 219 Qs from both the CSN and HSPN networks with 0.5 as similarity threshold in external data sets. Then, the performance of our best mQSSM was compared with the APPs prediction servers AMPDiscover and AMPFun. The proposed model showed its relevance by outperforming state-of-the-art machine learning models to predict APPs.
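As a reminder of what the reported MCC values measure, here is a minimal sketch of the standard Matthews correlation coefficient computed from a binary confusion matrix. The function name `mcc` and the example counts are illustrative assumptions; they are not the authors' code or data.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for an APP / non-APP benchmark, not results from the paper.
print(mcc(tp=45, tn=45, fp=5, fn=5))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy on imbalanced data sets.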
After applying the best mQSSM and additional filters on the non-APP space from StarPepDB, 95 AMPs were repurposed as potential APP hits. Due to the high sequence diversity of these peptides, different computational approaches were applied to identify relevant motifs for searching and designing new APPs. Lastly, we identified 11 promising APP lead candidates by using our best mQSSMs together with diversity-based network analyses, and 24 web servers for activity/toxicity and drug-like properties. These results support the view that network-based similarity searches can be an effective and reliable strategy to identify APPs. The proposed models and pipeline are freely available through the StarPep toolbox software at http://mobiosd-hub.com/starpep.
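The multiquery search idea described in this abstract can be sketched roughly as follows: each candidate peptide is compared against every query, the per-query similarities are fused into one score, and candidates at or above a threshold are reported as hits. This sketch makes several assumptions that are not confirmed by the abstract: it fuses scores with the maximum (one common group fusion rule), and it swaps in Python's `difflib.SequenceMatcher` ratio as a stand-in similarity rather than the measure used in the StarPep toolbox. The peptide strings are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT the measure used by the authors."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def max_fusion_score(candidate, queries):
    """Fuse per-query similarities with the MAX rule (assumed fusion rule)."""
    return max(similarity(candidate, q) for q in queries)


def search(candidates, queries, threshold=0.5):
    """Return (candidate, score) pairs scoring at or above the threshold, best first."""
    scored = ((c, max_fusion_score(c, queries)) for c in candidates)
    hits = [(c, s) for c, s in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical query and candidate sequences, for illustration only.
queries = ["GIGKFLHSAK", "KWKLFKKIEK"]
candidates = ["GIGKFLHSAK", "DDDDDDDD"]
print(search(candidates, queries, threshold=0.5))
```

The default threshold of 0.5 mirrors the similarity threshold mentioned in the abstract, but with a different similarity measure the same numeric cutoff would not be expected to behave the same way.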
Marine turtles represent an ancient lineage of marine vertebrates that evolved from terrestrial ancestors over 100 MYA, yet the genomic basis of the unique physiological and ecological traits enabling these species to thrive in diverse marine habitats remains largely unknown. Additionally, many populations have declined drastically due to anthropogenic activities over the past two centuries, and their recovery is a high global conservation priority. We generated and analyzed high-quality reference genomes for green (Chelonia mydas) and leatherback (Dermochelys coriacea) turtles, representing the two extant marine turtle families (MRCA ~60 MYA). Generally, these genomes are highly syntenic and homologous. Non-collinearity was associated with higher copy numbers of immune, zinc-finger, or olfactory receptor (OR) genes in green turtles. Gene family analyses suggested that ORs related to waterborne odorants have expanded in green turtles and contracted in leatherbacks, which may underlie immunological and sensory adaptations assisting navigation and occupancy of neritic versus pelagic environments, and diet specialization. Microchromosomes showed reduced collinearity, and greater gene content, heterozygosity, and genetic distances between species, supporting their critical role in vertebrate evolutionary adaptation. Finally, demographic history and diversity analyses showed stark contrasts between species, indicating that leatherback turtles have had a low yet stable effective population size, extremely low diversity when compared to other reptiles, and a higher proportion of deleterious variants, reinforcing concern over the persistence of this species under future climate scenarios. These highly contiguous genomes provide invaluable resources for advancing our understanding of evolution and conservation best practices in an imperiled vertebrate lineage.