Analysis of genetic polymorphism is a powerful tool for epidemiological surveillance and research. Robust inference from pathogen genetic variation, however, is often restrained by limited access to representative target DNA, especially in the study of obligate parasitic species for which ex vivo culture is resource-intensive or bias-prone. Modern sequence capture methods enable pathogen genetic variation to be analyzed directly from vector/host material but are often too complex and expensive for resource-poor settings where infectious diseases prevail. This study proposes a simple, cost-effective 'genome-wide locus sequence typing' (GLST) tool based on massively parallel amplification of information hotspots throughout the target pathogen genome. The multiplexed polymerase chain reaction amplifies hundreds of different, user-defined genetic targets in a single reaction tube, and subsequent agarose gel-based clean-up and barcoding complete library preparation at under 4 USD per sample. Approximately 100 libraries can be sequenced together in one Illumina MiSeq run. Our study generates a flexible GLST primer panel design workflow for Trypanosoma cruzi, the parasitic agent of Chagas disease. We successfully apply our 203-target GLST panel to direct, culture-free metagenomic extracts from triatomine vectors containing a minimum of 3.69 pg/µl T. cruzi DNA and further elaborate on method performance by sequencing GLST libraries from T. cruzi reference clones representing discrete typing units (DTUs) TcI, TcIII, TcIV, and TcVI. The 780 SNP sites we identify in the sample set repeatably distinguish parasites infecting sympatric vectors and detect correlations between genetic and geographic distances at regional (< 150 km) as well as continental scales. The markers also clearly separate DTUs. We discuss the advantages, limitations and prospects of our method across a spectrum of epidemiological research.
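To illustrate what a correlation between genetic and geographic distances means in practice, the following minimal Python sketch computes a Pearson correlation between pairwise great-circle distances and pairwise SNP differences. It is not the analysis pipeline used in this study: the function names, the 0/1/2 genotype coding, and the toy coordinates are all assumptions made for illustration, and a full Mantel-type analysis would also permute sample labels to assess significance.

```python
import math
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def snp_distance(g1, g2):
    """Mean absolute difference in alternate-allele count (0, 1 or 2)
    over sites genotyped in both samples (None marks a missing call)."""
    shared = [(x, y) for x, y in zip(g1, g2) if x is not None and y is not None]
    if not shared:
        raise ValueError("no shared genotyped sites")
    return sum(abs(x - y) for x, y in shared) / len(shared)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_correlation(coords, genotypes):
    """Correlate geographic and genetic distance over all sample pairs."""
    pairs = list(combinations(range(len(coords)), 2))
    geo = [haversine_km(coords[i], coords[j]) for i, j in pairs]
    gen = [snp_distance(genotypes[i], genotypes[j]) for i, j in pairs]
    return pearson(geo, gen)


# Hypothetical toy data: three samples along the equator, four SNP sites each.
toy_coords = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
toy_genotypes = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
r = distance_correlation(toy_coords, toy_genotypes)
```

In this toy set the genetic distance grows in proportion to the geographic distance, so `r` is close to 1. Real vector-derived samples would contain missing genotypes and unevenly spaced sites, which is why the sketch skips sites not called in both samples.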
Genome-wide single nucleotide polymorphism (SNP) analysis is a powerful and increasingly common approach in the study and surveillance of infectious disease. Understanding patterns of SNP diversity within pathogen genomes and across pathogen populations can resolve fundamental biological questions (e.g., reproductive mechanisms in T. cruzi [1]), reconstruct past [2] and present transmission networks (e.g., Staphylococcus infections within hospitals [3]), or identify the genetic bases of virulence [4,5] and resistance to drugs (see examples from Plasmodium spp. [6,7]). A number of obstacles, however, complicate access to representative, genome-wide SNP information using modern sequencing tools. Micro-pathogens are often sampled in low quantities and together with large amounts of host/vector tissue, microbiota, or environmental DNA. Sequencing is rarely viable directly from the infection source and studies have often found it necessary to isolate and culture the target organism to higher densities before extracting D...