Summary: Microbial community analysis using 16S rRNA gene amplicon sequencing is the backbone of many microbial ecology studies. Several approaches and pipelines exist for processing the raw data generated through DNA sequencing and converting the data into OTU-tables. Here we present ampvis2, an R package designed for analysis of microbial community data in OTU-table format with a focus on simplicity, reproducibility, and sample metadata integration, with a minimal set of intuitive commands. Unique features include flexible heatmaps and simplified ordination. By generating plots using the ggplot2 package, ampvis2 produces publication-ready figures that can be easily customised. Furthermore, ampvis2 includes features for interactive visualisation, which can be convenient for larger, more complex data.
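To make the OTU-table idea concrete, the following is a minimal sketch in plain Python of the kind of aggregation that an abundance heatmap typically summarises: read counts are converted to percent relative abundance per sample, and the taxa with the highest mean abundance are selected. It is not ampvis2 code and does not reproduce its API; the function names, the nested-dictionary layout, and the toy counts are assumptions made only for illustration.

```python
from statistics import mean


def to_relative_abundance(otu_counts):
    """Convert read counts to percent relative abundance within each sample.

    otu_counts maps sample name -> {taxon: read count} (hypothetical layout).
    """
    rel = {}
    for sample, counts in otu_counts.items():
        total = sum(counts.values())
        rel[sample] = {
            taxon: (100.0 * count / total if total else 0.0)
            for taxon, count in counts.items()
        }
    return rel


def top_taxa(rel_abund, n):
    """Return the n taxa with the highest mean relative abundance across samples."""
    taxa = sorted({t for sample in rel_abund.values() for t in sample})
    means = {
        t: mean(sample.get(t, 0.0) for sample in rel_abund.values())
        for t in taxa
    }
    # Sort by descending mean abundance; ties are broken alphabetically.
    return sorted(taxa, key=lambda t: (-means[t], t))[:n]


# Toy OTU table (made-up counts, two samples, three taxa).
otu_counts = {
    "plant_A": {"Tetrasphaera": 50, "Nitrospira": 30, "Zoogloea": 20},
    "plant_B": {"Tetrasphaera": 10, "Nitrospira": 60, "Zoogloea": 30},
}
rel = to_relative_abundance(otu_counts)
top = top_taxa(rel, 2)
print(top)
```

Normalising within each sample before ranking keeps samples with different sequencing depths comparable, which is why the sketch ranks on relative rather than raw counts.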
Microbial communities are responsible for biological wastewater treatment, but our knowledge of their diversity and function is still poor. Here, we sequence more than 5 million high-quality, full-length 16S rRNA gene sequences from 740 wastewater treatment plants (WWTPs) across the world and use the sequences to construct the ‘MiDAS 4’ database. MiDAS 4 is an amplicon sequence variant resolved, full-length 16S rRNA gene reference database with a comprehensive taxonomy from domain to species level for all sequences. We use an independent dataset (269 WWTPs) to show that MiDAS 4, compared to commonly used universal reference databases, provides a better coverage for WWTP bacteria and an improved rate of genus and species level classification. Taking advantage of MiDAS 4, we carry out an amplicon-based, global-scale microbial community profiling of activated sludge plants using two common sets of primers targeting regions of the 16S rRNA gene, revealing how environmental conditions and biogeography shape the activated sludge microbiota. We also identify core and conditionally rare or abundant taxa, encompassing 966 genera and 1530 species that represent approximately 80% and 50% of the accumulated read abundance, respectively. Finally, we show that for well-studied functional guilds, such as nitrifiers or polyphosphate-accumulating organisms, the same genera are prevalent worldwide, with only a few abundant species in each genus.
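One common way to operationalise "core" taxa and their accumulated read abundance can be sketched as follows. The abstract does not state the criteria used, so the abundance cutoff, the prevalence cutoff, the function names, and the toy data below are all hypothetical and serve only to illustrate the idea of a prevalence-based core.

```python
def core_taxa(rel_abund, min_abundance, min_prevalence):
    """Taxa above min_abundance (%) in at least min_prevalence (fraction) of samples.

    Both thresholds are illustrative parameters, not values taken from the study.
    """
    n_samples = len(rel_abund)
    taxa = {t for sample in rel_abund.values() for t in sample}
    core = set()
    for taxon in taxa:
        hits = sum(
            1 for sample in rel_abund.values()
            if sample.get(taxon, 0.0) > min_abundance
        )
        if n_samples and hits / n_samples >= min_prevalence:
            core.add(taxon)
    return core


def accumulated_share(rel_abund, taxa):
    """Percent of the summed abundance over all samples accounted for by taxa."""
    total = sum(sum(sample.values()) for sample in rel_abund.values())
    picked = sum(
        sample.get(t, 0.0) for sample in rel_abund.values() for t in taxa
    )
    return 100.0 * picked / total if total else 0.0


# Toy relative abundances in percent (made-up values, three plants).
rel_abund = {
    "plant_1": {"g_A": 60.0, "g_B": 39.5, "g_C": 0.5},
    "plant_2": {"g_A": 70.0, "g_B": 0.05, "g_C": 29.95},
    "plant_3": {"g_A": 50.0, "g_B": 49.0, "g_C": 1.0},
}
core = core_taxa(rel_abund, min_abundance=0.1, min_prevalence=0.8)
share = accumulated_share(rel_abund, core)
print(sorted(core), round(share, 2))
```

In this toy case g_B is abundant in two plants but falls below the cutoff in the third, so it is excluded from the core even though it contributes many reads.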
High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference databases and by the absence of a systematic and comprehensive taxonomy for the uncultured majority. Here, we demonstrate how high-throughput synthetic long-read sequencing can be applied to create ecosystem-specific full-length 16S rRNA gene amplicon sequence variant (FL-ASV) resolved reference databases that include high-identity references (>98.7% identity) for nearly all abundant bacteria (>0.01% relative abundance) using Danish wastewater treatment systems and anaerobic digesters as an example. In addition, we introduce a novel sequence identity-based approach for automated taxonomy assignment (AutoTax) that provides a complete seven-rank taxonomy for all reference sequences, using the SILVA taxonomy as a backbone, with stable placeholder names for unclassified taxa. The FL-ASVs are perfectly suited for the evaluation of taxonomic resolution and bias associated with primers commonly used for amplicon sequencing, allowing researchers to choose those that are ideal for their ecosystem. Reference databases processed with AutoTax greatly improve the classification of short-read 16S rRNA ASVs at the genus- and species-level, compared with the commonly used universal reference databases. Importantly, the placeholder names provide a way to explore the unclassified environmental taxa at different taxonomic ranks, which in combination with in situ analyses can be used to uncover their ecological roles.
The assembly of bacterial communities in wastewater treatment plants (WWTPs) is affected by immigration via wastewater streams, but the impact and extent of bacterial immigrants are still unknown. Here, we quantify the effect of immigration at the species level in 11 Danish full-scale activated sludge (AS) plants. All plants have different source communities but have very similar process design, defining the same overall environmental growth conditions. The AS community composition in each plant was strongly reflected by the corresponding influent wastewater (IWW) microbial composition. Most species in AS across the plants were detected and quantified in the corresponding IWW, allowing us to identify their fate in the AS: growing, disappearing, or surviving. Most of the abundant species in IWW disappeared in AS, so their presence in the AS biomass was only due to continuous mass-immigration. In AS, most of the abundant growing species were present in the IWW at very low abundances. We predicted the AS species abundances from their abundance in IWW by using a partial least squares regression model. Some species in AS were predicted by their own abundance in IWW, while others by multiple species abundances. Detailed analyses of functional guilds revealed different prediction patterns for different species. We show, in contrast to the present understanding, that the AS microbial communities were strongly controlled by the IWW source community and could be quantitatively predicted by taking into account immigration. This highlights a need to revise the way we understand, design, and manage the microbial communities in WWTPs.
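The three fates named above (growing, disappearing, surviving) can be illustrated with a simple ratio-based labelling of each species from its abundance in AS relative to IWW. This is only a sketch under stated assumptions: the abstract does not give the criteria the study used, and the ratio cutoffs, function name, and toy abundances below are hypothetical.

```python
def classify_fate(iww, sludge, grow_ratio=2.0, vanish_ratio=0.5):
    """Label each species found in activated sludge by its AS/IWW abundance ratio.

    iww and sludge map species -> relative abundance (%).
    grow_ratio and vanish_ratio are hypothetical cutoffs chosen for illustration.
    """
    fates = {}
    for species, as_abundance in sludge.items():
        iww_abundance = iww.get(species, 0.0)
        if iww_abundance <= 0.0:
            # No influent signal, so no ratio can be formed.
            fates[species] = "not detected in IWW"
            continue
        ratio = as_abundance / iww_abundance
        if ratio >= grow_ratio:
            fates[species] = "growing"
        elif ratio <= vanish_ratio:
            fates[species] = "disappearing"
        else:
            fates[species] = "surviving"
    return fates


# Toy abundances in percent (made-up values).
iww = {"sp1": 5.0, "sp2": 0.01, "sp3": 1.0}
sludge = {"sp1": 0.1, "sp2": 2.0, "sp3": 1.2, "sp4": 0.3}
fates = classify_fate(iww, sludge)
print(fates)
```

The toy data mirror the pattern described in the abstract: sp1 is abundant in the influent but nearly absent in the sludge, while sp2 is rare in the influent and enriched in the sludge.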