High-throughput amplicon sequencing (HTAS) of conserved DNA regions is a powerful technique to characterize microbial communities. Recently, spike-in mock communities have been used to measure the accuracy of sequencing platforms and data analysis pipelines. To assess the performance of sequencing platforms and data processing pipelines on fungal internal transcribed spacer (ITS) amplicons, we created two ITS spike-in control mock communities composed of cloned DNA in plasmids: a biological mock community, consisting of ITS sequences from fungal taxa, and a synthetic mock community (SynMock), consisting of non-biological ITS-like sequences. Using these spike-in controls we show that: (1) a non-biological synthetic control (e.g., SynMock) is the best solution for parameterizing bioinformatics pipelines, (2) pre-clustering steps for variable-length amplicons are critically important, and (3) a major source of bias is the initial polymerase chain reaction (PCR), so HTAS read abundances are typically not representative of starting values. We developed AMPtk, a versatile software solution equipped to handle variable-length amplicons and to quality-filter HTAS data based on spike-in controls. While we describe herein a non-biological SynMock community for ITS sequences, the concept and the AMPtk software can be applied widely to any HTAS dataset to improve data quality.
Regions of rDNA are commonly used to infer phylogenetic relationships among fungal species and as DNA barcodes for identification. These regions occur in large tandem arrays, and concerted evolution is believed to reduce intragenomic variation among copies within these arrays, although some variation might still exist. Phylogenetic studies typically use consensus sequencing, which effectively conceals most intragenomic variation, but cloned sequences containing intragenomic variation are becoming prevalent in DNA databases. To understand the effects of using cloned rDNA sequences in phylogenetic analyses, we amplified and cloned the ITS region from pure cultures of six Laetiporus species and one Wolfiporia species (Basidiomycota, Polyporales). An average of 66 clones per culture were selected randomly and sequenced from 21 cultures, producing a total of 1399 interpretable sequences. Significant variation (≥ 5% variation in sequence similarity) was observed among ITS copies within six cultures from three species clades (L. cincinnatus, L. sp. clade J, and Wolfiporia dilatohypha), and phylogenetic analyses with the cloned sequences produced different trees relative to analyses with consensus sequences. Cloned sequences from L. cincinnatus fell into more than one species clade, and numerous cloned L. cincinnatus sequences fell into entirely new clades, which, if analyzed on their own, most likely would be recognized as "undescribed" or "novel" taxa. The use of a 95% cutoff for defining operational taxonomic units (OTUs) produced seven Laetiporus OTUs with consensus ITS sequences and 20 OTUs with cloned ITS sequences. The use of cloned rDNA sequences might be problematic in fungal phylogenetic analyses, as well as in fungal barcoding initiatives and efforts to detect fungal pathogens in environmental samples.
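To make the 95% OTU cutoff concrete, here is a minimal Python sketch of greedy, centroid-based clustering at a similarity threshold. It is an illustration only, not the method used in the study: the function names, the toy sequences, and the identity measure (one minus edit distance divided by the longer sequence length) are all assumptions chosen for brevity, and real analyses would rely on an alignment-based clustering tool.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Hypothetical identity measure: 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def greedy_otus(seqs: List[str], threshold: float = 0.95) -> List[List[str]]:
    """Assign each sequence to the first centroid it matches at >= threshold,
    otherwise start a new OTU with that sequence as its centroid."""
    centroids: List[str] = []
    clusters: List[List[str]] = []
    for s in seqs:
        for k, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[k].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return clusters


# Toy data (not from the study): one near-identical variant, one divergent sequence.
base = "ACGT" * 10
variant = "GCGT" + "ACGT" * 9   # a single substitution relative to base
distant = "T" * 40
clusters = greedy_otus([base, variant, distant])
```

In this toy input the single-substitution variant joins the first OTU and the divergent sequence founds a second one. The same logic suggests why cloned sequences can inflate OTU counts: any intragenomic copy that falls below the threshold relative to existing centroids becomes its own OTU.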
DNA analysis of predator faeces using high-throughput amplicon sequencing (HTS) enhances our understanding of predator-prey interactions. However, conclusions drawn from this technique are constrained by biases that occur in multiple steps of the HTS workflow. To better characterize insectivorous animal diets, we used DNA from a diverse set of arthropods to assess PCR biases of commonly used and novel primer pairs for the mitochondrial gene cytochrome c oxidase subunit 1 (COI). We compared diversity recovered from HTS of bat guano samples using a commonly used primer pair, "ZBJ," to results using the novel primer pair "ANML." To parameterize our bioinformatics pipeline, we created an arthropod mock community consisting of single-copy (cloned) COI sequences. To examine biases associated with both PCR and HTS, mock community members were combined in equimolar amounts both pre- and post-PCR. We validated our system using guano from bats fed known diets and using composite samples of morphologically identified insects collected in pitfall traps. In PCR tests, the ANML primer pair amplified 58 of 59 arthropod taxa (98%), whereas ZBJ amplified 24-40 of 59 taxa (41%-68%). Furthermore, in an HTS comparison of field-collected samples, the ANML primers detected nearly fourfold more arthropod taxa than the ZBJ primers. The additional arthropods detected include medically and economically relevant insect groups such as mosquitoes. Results revealed biases at both the PCR and sequencing levels, demonstrating the pitfalls associated with using HTS read numbers as proxies for abundance. The use of an arthropod mock community allowed for improved bioinformatics pipeline parameterization.
Keywords: AMPtk, arthropod mock community, bat guano, dietary analysis, insectivore, next-generation sequencing
*Indicates shared first authorship based on equal contributions.
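The amplification rates quoted in the abstract above can be reproduced with simple arithmetic. The short Python check below uses only the counts given in the text; the helper name `detection_rate` is a hypothetical one introduced for illustration, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
def detection_rate(detected: int, total: int) -> int:
    """Percentage of taxa amplified, rounded to the nearest whole percent."""
    return round(100 * detected / total)


TOTAL_TAXA = 59  # arthropod taxa tested in the PCR comparison

anml = detection_rate(58, TOTAL_TAXA)      # ANML primer pair
zbj_low = detection_rate(24, TOTAL_TAXA)   # ZBJ, lower bound
zbj_high = detection_rate(40, TOTAL_TAXA)  # ZBJ, upper bound

print(f"ANML: {anml}%  ZBJ: {zbj_low}%-{zbj_high}%")
```

Under that rounding assumption the computed values agree with the reported 98% for ANML and 41%-68% for ZBJ.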