Apicomplexans are a group of microbial eukaryotes that contain some of the most well-studied parasites, including widespread intracellular pathogens of mammals such as Toxoplasma and Plasmodium (the agent of malaria), and emergent pathogens like Cryptosporidium and Babesia. Decades of research have illuminated the pathogenic mechanisms, molecular biology, and genomics of model apicomplexans, but we know surprisingly little about their diversity and distribution in natural environments. In this study we analyze the distribution of apicomplexans across a range of both host-associated and free-living environments, covering animal hosts from cnidarians to mammals, and ecosystems from soils to fresh and marine waters. Using publicly available small subunit (SSU) rRNA gene databases, high-throughput environmental sequencing (HTES) surveys such as Tara Oceans and VAMPS, and HTES data that we generated ourselves, we developed an apicomplexan reference database, which includes the largest apicomplexan SSU rRNA tree available to date and encompasses comprehensive sampling of this group and their closest relatives. This tree allowed us to identify and correct incongruences in the molecular identification of sequences, particularly within the hematozoans and the gregarines. Analyzing the diversity and distribution of apicomplexans in HTES studies with this curated reference database also showed a widespread, and quantitatively important, presence of apicomplexans across a variety of free-living environments. These data allow us to describe a remarkable molecular diversity in this group relative to current knowledge, especially when compared with the diversity identified from described apicomplexan species. This revision is most striking in marine environments, where potentially the most diverse apicomplexans apparently exist but have not yet been formally recognized.
The new database will be useful for both microbial ecology and epidemiological studies, and will provide a valuable reference for medical and veterinary diagnosis, especially in cases of emerging, zoonotic, and cryptic infections.
Author Summary
Apicomplexans are important animal and human parasites, but little is known about their distribution and diversity in the natural environment. We have developed a phylogenetically informed and manually curated reference database for the SSU rRNA barcode gene, and analyzed all publicly available sequences from a broad range of environments, providing a needed framework to analyze high-throughput environmental sequencing (HTES) data. The reference database and the amplicon sequences identified a striking diversity of apicomplexans in the environment, including habitats not usually associated with this group (e.g. open ocean). The phylogenetic framework will underpin microbial ecology studies and provide valuable resources for medical and veterinary biology.
Main text