New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes from complex microbial communities. Using long read (MinION) data obtained from an ensemble of activated sludge enrichment bioreactors, we 1) describe new methods for validating long read assembled genomes using their counterpart short read metagenome-assembled genomes; 2) assess the influence of different correction procedures on genome quality and predicted gene quality; and 3) contribute 21 new closed or complete genomes of community members, including several species known to play key functional roles in wastewater bioprocesses: specifically, microbes known to exhibit the polyphosphate- and glycogen-accumulating organism phenotypes (namely Accumulibacter and Dechloromonas, and Micropruina and Defluviicoccus, respectively), and filamentous bacteria (Thiothrix) associated with the formation and stability of activated sludge flocs. Our findings further establish the feasibility of long read metagenome-assembled genome recovery, and demonstrate the utility of parallel sampling of moderately complex enrichment communities for recovery of genomes of key functional species relevant for the study of complex wastewater treatment bioprocesses.
The development of long read sequencing technologies, such as the Oxford Nanopore Technologies MinION and Pacific Biosciences SMRT, is presenting new opportunities for the effective recovery of complete, closed genomes [1, 2]. While these new approaches have mostly been applied to single species isolates [3, 4], the ability of this methodology to recover genomes of member taxa from complex microbial community (microbiome) data is now actively being explored.
After long read sequencing technologies first became available, several studies pioneered the collection of long read data, or combined long and short read data, from complex microbial communities, for example from moderately to highly enriched bioreactor communities [5, 6], co-culture enrichments [7], marine holobionts [8] or full scale anaerobic digester communities [9]. Several datasets also provided benchmarking data from long and short read sequencing of mock communities [10, 11, 12]. New long read analysis methods [13, 14] and binning algorithms designed for long read metagenome data [15] have also appeared, anticipating the future expansion of metagenome data generated from these new instruments. More recent studies [16, 17, 18, 19, 20, 21, 22] have collectively established that full-length (or near full-length) genomes can be recovered from long read sequencing of complex communities, which sets the stage for further development of genome-resolved long read metagenomics.
Here we extend our previous work [16, 22] on recovering metagenome-assembled genomes from long read data obtained from enrichment (continuous culture) reactors inoculated with activated sludge microbial c...