Whole genome sequencing (WGS) is a valuable resource for understanding the evolutionary history of poorly known species. However, in organisms with large genomes, such as most amphibians, WGS remains prohibitively challenging, and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms usually lack a reference genome, so the transcriptome must be assembled de novo. We used RNA-seq to obtain the transcriptomic profile of Oreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggest that genes related to the immune system and defense mechanisms are abundant in the transcriptome of O. cruralis. We also present a pipeline to assist with pre-processing, assembling, evaluating and functionally annotating a de novo transcriptome from RNA-seq data of non-model organisms. Our pipeline guides the inexperienced user in an intuitive way through all the steps necessary to build de novo transcriptome assemblies using readily available software, and is freely available at: https://github.com/biomendi/TRANSCRIPTOME-ASSEMBLY-PIPELINE/wiki
PeerJ reviewing PDF
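To put the annotation counts reported above in proportion, the following minimal sketch expresses each database count as a percentage of the putative genes and computes the average number of transcripts per putative gene. It assumes that the 422,999 putative genes are the appropriate denominator for "of those"; the function and variable names are illustrative and are not part of the authors' pipeline.

```python
# Counts as reported in the abstract.
TRANSCRIPTS = 550_871
PUTATIVE_GENES = 422_999
ANNOTATED = {"Pfam": 23_500, "EggNOG": 37_349, "KEGG": 38_120, "GO": 45_885}


def annotation_rates(annotated, total):
    """Return the percentage of `total` genes annotated in each database.

    Assumption: `total` (putative genes) is the right denominator.
    """
    return {db: 100.0 * count / total for db, count in annotated.items()}


def transcripts_per_gene(transcripts, genes):
    """Return the mean number of assembled transcripts per putative gene."""
    return transcripts / genes


if __name__ == "__main__":
    for db, pct in annotation_rates(ANNOTATED, PUTATIVE_GENES).items():
        print(f"{db}: {pct:.2f}% of putative genes annotated")
    ratio = transcripts_per_gene(TRANSCRIPTS, PUTATIVE_GENES)
    print(f"Transcripts per putative gene: {ratio:.2f}")
```

Because a gene can be annotated in more than one database, these percentages overlap and should not be summed.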