Background
Genome-centric approaches are widely used to investigate microbial compositions, dynamics, ecology, and interactions within various environmental systems. Hundreds or even thousands of genomes could be retrieved in a single study contributed by the cost-effective short-read sequencing and developed assembly/binning pipelines. However, conventional binning methods usually yield highly fragmented draft genomes that limit our ability to comprehensively understand these microbial communities. Thus, to leverage advantage of both the long and short reads to retrieve more complete genomes from environmental samples is a must-do task to move this direction forward.
Results
Here, we used an iterative hybrid assembly (IHA) approach to reconstruct 49 metagenome-assembled genomes (MAGs), including 27 high-quality (HQ) and high-contiguity (HC) genomes with contig number ≤ 5, eight of which were circular finished genomes from a partial-nitritation anammox (PNA) reactor. These 49 recovered MAGs (43 MAGs encoding full-length rRNA, average N50 of 2.2 Mbp), represented the majority (92.3%) of the bacterial community. Moreover, the workflow retrieved HQ and HC MAGs even with an extremely low coverage (relative abundance < 0.1%). Among them, 34 MAGs could not be assigned to the genus level, indicating the novelty of the genomes retrieved using the IHA method proposed in this study. Comparative analysis of HQ MAG pairs reconstructed using two methods, i.e., hybrid and short reads only, revealed that identical genes in the MAG pairs represented 87.5% and 95.5% of the total gene inventory of hybrid and short reads only assembled MAGs, respectively. In addition, the first finished anammox genome of the genus Ca. Brocadia reconstructed revealed that there were two identical hydrazine synthase (hzs) genes, providing the exact gene copy number of this crucial phylomarker of anammox at the genome level.
Conclusions
Our results showcased the high-quality and high-contiguity genome retrieval performance and demonstrated the feasibility of complete genome reconstruction using the IHA workflow from the enrichment system. These (near-) complete genomes provided a high resolution of the microbial community, which might help to understand the bacterial repertoire of anammox-associated systems. Combined with other validation experiments, the workflow can enable a detailed view of the anammox or other similar enrichment systems.