“…Briefly, the fastq data were first evaluated for quality using FastQC_0.10.1 (multi-file). The high-quality fastq data were then mapped to the TAIR10 genome using TopHat2-SE with the following parameter settings: reference genome, Arabidopsis thaliana [Mouse-ear cress] (Ensembl 14); reference annotations, Arabidopsis thaliana (Mouse-ear cress) (Ensembl 14); minimum length of read segments, 20; minimum isoform fraction, 0.15; Bowtie 2 speed and sensitivity, sensitive; maximum intron length that may be found during split-segment search, 10,000; minimum intron length, 50; Bowtie version, 2.1.0; anchor length, 8; number of mismatches allowed in each segment alignment for reads mapped independently, 2; maximum intron length, 5,000; maximum number of mismatches that can appear in the anchor region of a spliced alignment, 0; maximum number of alignments allowed, 20; minimum intron length that may be found during split-segment search, 50; TopHat version, 2.0.9 (Kim et al., 2013). The BAM files generated by TopHat were further processed to remove duplicate reads and reads aligned to multiple genomic positions, using the following R code with Bioconductor packages: library(GenomicAlignments); library(GenomicRanges); library(Rsamtools); library(rtracklayer); flag0 <- scanBamFlag(isDuplicate=FALSE, isNotPassingQualityControls=FALSE); param0 <- ScanBamParam(flag=flag0, what="seq"); x <- readGAlignments("input.bam", use.names=TRUE, param=param0); dup <- duplicated(mcols(x)$seq); table(dup); y <- x[!dup]; export(y, BamFile("output.bam")).…”
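The filtering step in the quoted R code rests on duplicated() applied to the read sequences: the first record carrying a given sequence is kept and every later record with the same sequence is dropped. The base-R sketch below illustrates only that logic. The vector seqs and the read names are hypothetical stand-ins for mcols(x)$seq and are not taken from the original data; no BAM file or Bioconductor package is involved.

```r
# Hypothetical stand-in for mcols(x)$seq: one sequence string per alignment record.
seqs <- c(read1 = "ACGTAC",
          read2 = "TTGACA",
          read3 = "ACGTAC",   # same sequence as read1
          read4 = "GGCATT",
          read5 = "TTGACA")   # same sequence as read2

# duplicated() flags the second and later occurrences of each value,
# so the first record with a given sequence is never flagged.
dup <- duplicated(seqs)

# Tally of unique (FALSE) versus repeated (TRUE) records, as in the quoted code.
print(table(dup))

# Keep only the records that were not flagged.
kept <- seqs[!dup]
print(names(kept))
```

In this toy input, read3 and read5 are dropped and read1, read2, and read4 are kept. Because the comparison is made on sequence strings alone, two records are treated as duplicates whenever their stored sequences are identical, whether they come from the same read reported at several positions or from distinct reads with the same sequence.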