Plasmodium parasites undergo several major developmental transitions during their complex lifecycle, which are enabled by precisely ordered gene expression programs. Transcriptomes from the 48-hour blood stages of the major human malaria parasite Plasmodium falciparum have been described using cDNA microarrays and RNA-seq, but these assays have not always performed well within non-coding regions, where the AT-content is often 90-95%. We developed a directional, amplification-free RNA-seq protocol (DAFT-seq) to reduce bias against AT-rich cDNA, which we have applied to three strains of P. falciparum (3D7, HB3 and IT). While strain-specific differences were detected, overall there is strong conservation between the transcriptional profiles. For the 3D7 reference strain, transcription was detected from 89% of the genome, with over 75% of the genome transcribed into mRNAs. These datasets allowed us to refine the 5' and 3' untranslated regions (UTRs), which can be variable, long (>1,000 nt), and often overlap those of adjacent transcripts. We also find that transcription from bidirectional promoters frequently results in non-coding, antisense transcripts. By capturing the 5' ends of mRNAs, we reveal both constant and dynamic use of transcriptional start sites across the intraerythrocytic developmental cycle, resulting in an updated view of the P. falciparum transcriptome.
Materials and Methods

Parasite culture and RNA extraction

P. falciparum clone 3D7 was cultured in O+ human erythrocytes and 10% human serum in RPMI-based media, using standard methods (29). RNA was extracted using the TRIzol reagent as previously described (30). RNA was quality controlled and quantified using an Agilent Bioanalyzer 2100 Nano RNA chip.

Directional, Amplification-Free RNA-seq (DAFT-seq) libraries

PolyA+ RNA (mRNA) was selected using magnetic oligo-d(T) beads. Reverse transcription using Superscript II (Life) was primed using oligo-d(T) primers, and second-strand cDNA synthesis included dUTP. The resulting cDNA was fragmented using a Covaris AFA sonicator. A "with-bead" protocol was used for dA-tailing, end repair and adapter ligation (NEB) using "PCR-free" barcoded sequencing adaptors (Bioo Scientific, similar to Kozarewa et al.). After two rounds of SPRI cleanup, the libraries were eluted in EB buffer, and USER enzyme mix (NEB) was used to digest the dUTP-containing second-strand cDNA, generating directional libraries. The libraries were quantified by qPCR and sequenced on an Illumina HiSeq 2000, generating 100 bp paired-end reads. Reads were mapped using TopHat2 (31), using directional parameters and a maximum intron size of 5,000 nt.
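
The mapping step can be illustrated with a minimal sketch that assembles a TopHat2 command line. The text specifies only "directional parameters" and a maximum intron size of 5,000 nt, so the library type chosen here (fr-firststrand, the setting commonly used for dUTP-based libraries) is an assumption, and the index prefix, read file names and output directory are hypothetical placeholders.

```python
import shlex


def build_tophat2_command(index_prefix, reads_1, reads_2, output_dir,
                          library_type="fr-firststrand", max_intron=5000):
    """Return a TopHat2 argument list for stranded, paired-end reads.

    library_type is an assumption (typical for dUTP second-strand marking);
    max_intron follows the 5,000 nt limit stated in the methods.
    """
    return [
        "tophat2",
        "--library-type", library_type,          # strand-aware mapping
        "--max-intron-length", str(max_intron),  # cap on intron size, in nt
        "--output-dir", output_dir,
        index_prefix,                            # Bowtie2 index prefix
        reads_1,                                 # read 1 FASTQ
        reads_2,                                 # read 2 FASTQ
    ]


# Hypothetical file names, for illustration only.
cmd = build_tophat2_command(
    "Pf3D7_index", "sample_1.fastq.gz", "sample_2.fastq.gz", "tophat_out"
)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a system where TopHat2 and a matching Bowtie2 index are installed. Building the arguments as a list rather than a single string avoids shell-quoting problems with file paths.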
5UTR-seq libraries

PolyA+ RNA was isolated using oligo-d(T)-coated magnetic beads. Superscript II reverse transcriptase was used to synthesise first-strand cDNA using oligo-d(T) primers in the presence of template-switching oligos (TSOs), which had the same sequence as those in the Smart-seq2 protocol (32, 33). The TSOs were used to "tag" the end of the cDNA sequences; this tag is used to prime second str...