The first two authors contributed equally to this work. ✝ The last two authors contributed equally to this work.
Abstract
Background: Targeted next generation sequencing offers the potential for consistent, deep coverage of information rich genomic regions to characterize polyclonal Plasmodium falciparum infections. However, methods to identify and sequence these genomic regions are currently limited.
Methods: A bioinformatic pipeline and multiplex methods were developed to identify and simultaneously sequence 100 targets, and were applied to dried blood spot (DBS) controls and field isolates from Mozambique. For comparison, whole genome sequencing (WGS) data were generated for the same controls.
Results: Using publicly available genomes, 4465 high diversity genomic regions suited for targeted sequencing were identified, representing the P. falciparum heterozygome. For this study, 93 microhaplotypes with high diversity (median expected heterozygosity, HE = 0.7) were selected along with 7 drug resistance loci. The sequencing method achieved very high coverage (median 99%), specificity (99.8%) and sensitivity (90% for haplotypes with 5% within-sample frequency in DBS with 100 parasites/µL). In silico analyses revealed that microhaplotypes provided much higher resolution to discriminate related from unrelated polyclonal infections than biallelic SNP barcodes.
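As a point of reference for the diversity metric quoted above, expected heterozygosity (HE) at a locus is commonly computed as one minus the sum of squared allele (here, haplotype) frequencies. The sketch below assumes that uncorrected form; the abstract does not state which estimator was used, and the function name and the example haplotype labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def expected_heterozygosity(haplotypes):
    """Return H_E = 1 - sum(p_i^2) for a list of observed haplotype calls.

    Assumption: the uncorrected estimator is used; a finite-sample
    correction factor n / (n - 1) is sometimes applied and is omitted here.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one haplotype observation is required")
    # p_i is the population frequency of haplotype i at this locus.
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical locus with three haplotypes at frequencies 0.5, 0.25, 0.25.
example = ["hapA", "hapA", "hapB", "hapC"]
h_e = expected_heterozygosity(example)  # 1 - (0.25 + 0.0625 + 0.0625) = 0.625
```

Under this definition a locus with a single haplotype has HE = 0, and HE approaches 1 as the number of equally frequent haplotypes grows, which is why multiallelic microhaplotypes can exceed the 0.5 ceiling of a biallelic SNP.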
Discussion: The bioinformatic and laboratory methods outlined here provide a flexible tool for efficient, low-cost, high throughput interrogation of the P. falciparum genome, and can be tailored to simultaneously address multiple questions of interest in various epidemiological settings.