tRNA fragments (tRFs) are a class of small non-coding RNAs (sncRNAs) derived from tRNAs. tRFs are highly abundant in many cell types, including stem cells and cancer cells, and are found in all domains of life. Beyond translation control, tRFs have several functions ranging from transposon silencing to control of cell proliferation. However, the analysis of tRFs presents specific challenges, and their biogenesis is not well understood: tRFs are very heterogeneous and carry numerous post-transcriptional modifications. Here we describe a bioinformatic pipeline to study tRF populations and shed light on tRF biogenesis. We applied it to Illumina small RNA sequencing datasets from ovaries of wild-type Drosophila and of mutants affecting two highly conserved steps of tRNA biogenesis: 5' pre-tRNA processing (RNase P subunit Rpp30) and tRNA 2'-O-methylation (CG7009 and CG5220). Using our pipeline, we show how defects in tRNA biogenesis affect nuclear and mitochondrial tRF populations, as well as the biogenesis of other small non-coding RNAs such as small nucleolar RNAs (snoRNAs). This tRF analysis workflow will advance the current understanding of tRF biogenesis, which is crucial to better comprehend the roles of tRFs and their implication in human pathology.
MATERIALS AND METHODS
Fly stocks
Fly stocks are described in (Molla-Herman et al., 2015; M. Angelova, 2019).
RNA extraction from ovaries
RNA was extracted from Drosophila ovaries following standard methods detailed in (Molla-Herman et al., 2015; M. Angelova, 2019).
Small RNA sequencing
RNA samples of 3-5 µg were used for high-throughput sequencing on an Illumina HiSeq (10% of a single-read lane, 1×50 bp; Fasteris). RNA sequences of 15-29 nt, depleted of rRNA (riboZero), were sequenced. All analyses were performed with Galaxy tools (https://mississippi.snv.jussieu.fr). Data set deposition is described in (Molla-Herman et al., 2015; M. Angelova, 2019). Data are available at the European Nucleotide Archive (ENA) of the EMBL-EBI (http://www.ebi.ac.uk/ena) under accession numbers PRJEB10569 (Rpp30 mutants), PRJEB35301 and PRJEB35713 (Nm mutants).
Clipping and concatenation
Adaptors were clipped from the raw reads [Clip adapter (Galaxy-Version 2.3.0, owner: artbio)] and FASTQ quality control was performed [FastQC Read Quality reports (Galaxy-Version 0.72)]. Since replicates were homogeneous in quality and analysis (replicates for CG7009* heterozygous and homozygous, triplicates for CG7009*-CG5220* double mutants), we merged them [Concatenate multiple datasets tail-to-head (Galaxy-Version 1.4.1, owner: artbio)] to obtain a single FASTA file per genotype. CG7009*/Def9487 was used to obtain normalization numbers but is not shown in the figures.
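The clipping and merging steps can be illustrated with a minimal Python sketch. It is not the code of the Galaxy tools named above. The adapter sequence, the 7-nt seed used to locate the adapter, the function names, and the choice to discard reads without a detectable adapter are assumptions made for illustration only; the 15-29 nt window is taken from the sequenced size range.

```python
# Illustrative sketch only: a simplified adapter clipper and replicate merger.
# The adapter below is an assumed example; the source does not state which
# adapter was used.
EXAMPLE_ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def clip_adapter(read, adapter, min_len=15, max_len=29, seed_len=7):
    """Return the insert preceding the adapter, or None if the read is rejected.

    The adapter is located by exact match of its first `seed_len` nucleotides
    (an assumed, simplified matching rule). Reads without a detectable adapter,
    or whose insert falls outside [min_len, max_len], are discarded.
    """
    position = read.find(adapter[:seed_len])
    if position == -1:
        return None
    insert = read[:position]
    if min_len <= len(insert) <= max_len:
        return insert
    return None


def concatenate(*replicates):
    """Merge clipped reads from several replicates, tail-to-head, into one list."""
    merged = []
    for replicate in replicates:
        merged.extend(replicate)
    return merged


# Hypothetical usage with two small replicates of raw reads.
replicate_1 = ["ACGTACGTACGTACGTACGT" + EXAMPLE_ADAPTER]
replicate_2 = ["TTTTGGGGCCCCAAAATTTTGG" + EXAMPLE_ADAPTER, "ACGT" + EXAMPLE_ADAPTER]
clipped = [
    [insert for insert in (clip_adapter(r, EXAMPLE_ADAPTER) for r in rep) if insert]
    for rep in (replicate_1, replicate_2)
]
merged_reads = concatenate(*clipped)
```

Merging after clipping keeps the per-replicate quality checks independent of the pooled file used downstream.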
Data normalization using DESeq miRNA counts
Data were normalized with library size factors (SF) obtained with [DESeq geometrical normalization (Galaxy-Version 1.0.1, owner: artbio)] from miRNA counts generated with [miRcounts (Galaxy-Version 1.3.2)], allowing 0 mismatches (MM). The 1/SF values were then used in Galaxy small RNA maps (Sup. Fig. 5).
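As a rough illustration of how size factors of this kind can be derived, the following Python sketch implements the median-of-ratios estimator on which DESeq normalization is based. It is not the Galaxy tool itself. The function names, the input layout (one dictionary of miRNA counts per library), and the example counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    `counts` maps library name -> {miRNA name -> read count} (assumed layout).
    Only miRNAs with a non-zero count in every library are used, because the
    geometric mean across libraries is undefined otherwise.
    """
    libraries = list(counts)
    shared = set.intersection(*(set(counts[lib]) for lib in libraries))
    usable = [m for m in sorted(shared) if all(counts[lib][m] > 0 for lib in libraries)]
    if not usable:
        raise ValueError("no miRNA has non-zero counts in all libraries")
    # Log of the geometric mean of each miRNA across libraries.
    log_geo_mean = {
        m: sum(math.log(counts[lib][m]) for lib in libraries) / len(libraries)
        for m in usable
    }
    # Size factor = median ratio of a library's counts to the geometric means.
    return {
        lib: math.exp(median(math.log(counts[lib][m]) - log_geo_mean[m] for m in usable))
        for lib in libraries
    }


def scaling_factors(factors):
    """Return 1/SF per library, the value applied to the small RNA maps."""
    return {lib: 1.0 / sf for lib, sf in factors.items()}


# Hypothetical counts: the "mutant" library is sequenced twice as deeply.
example_counts = {
    "wild_type": {"mir-1": 100, "mir-8": 50, "bantam": 400},
    "mutant": {"mir-1": 200, "mir-8": 100, "bantam": 800},
}
example_sf = size_factors(example_counts)
example_scaling = scaling_factors(example_sf)
```

Multiplying read counts by 1/SF brings libraries of different depth onto a common scale, under the assumption that most miRNAs are not differentially expressed between genotypes.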
Data normalization with DESeq using ...