Background
Venom has evolved in parallel in multiple animals for the purpose of self-defense, prey capture or both. These venoms typically consist of highly complex mixtures of toxins: diverse bioactive peptides and/or proteins, each with a specific pharmacological activity. Because of their specificity, toxins can be used as experimental tools to study cell mechanisms and to develop novel drugs. It is therefore potentially valuable to explore the venoms of various animals to characterize their toxins and identify novel toxin families. This study focuses on the annotation and exploration of the transcriptomes of six scorpion species from three different families. The transcriptomes were annotated with a custom-built automated pipeline, primarily consisting of Basic Local Alignment Search Tool (BLAST) searches against UniProt databases and filter steps based on transcript coverage.
Results
We annotated the transcriptomes of four scorpions from the family Buthidae, one from Iuridae and one from Diplocentridae using our annotation pipeline. We found that the four buthid scorpions primarily produce disulfide-bridged, ion channel-targeting toxins, while the non-buthid scorpions have a higher abundance of non-disulfide-bridged toxins. Furthermore, analysis of the “unidentified” transcripts led to the discovery of six novel putative toxin families containing a total of 37 novel putative toxins. Additionally, 33 novel toxins belonging to existing toxin families were found. Lastly, 19 novel putative secreted proteins without toxin-like disulfide bonds were found.
Conclusions
We were able to assign most transcripts to a toxin family and classify the venom composition for all six scorpions. In addition to advancing our fundamental knowledge of scorpion venomics, this study may serve as a starting point for future research by facilitating the identification of the venom composition of scorpions and by identifying novel putative toxin families.
Electronic supplementary material
The online version of this article (10.1186/s12864-019-6013-6) contains supplementary material, which is available to authorized users.