Bacteriophages typically have small genomes [1] and depend on their bacterial hosts for replication [2]. Here we sequenced DNA from diverse ecosystems and found hundreds of phage genomes with lengths of more than 200 kilobases (kb), including a genome of 735 kb, which is, to our knowledge, the largest phage genome to be described to date. Thirty-five genomes were manually curated to completion (circular and no gaps). Expanded genetic repertoires include diverse and previously undescribed CRISPR-Cas systems, transfer RNAs (tRNAs), tRNA synthetases, tRNA-modification enzymes, translation-initiation and elongation factors, and ribosomal proteins. The CRISPR-Cas systems of phages have the capacity to silence host transcription factors and translational genes, potentially as part of a larger interaction network that intercepts translation to redirect biosynthesis to phage-encoded functions. In addition, some phages may repurpose bacterial CRISPR-Cas systems to eliminate competing phages. We phylogenetically define the major clades of huge phages from human and other animal microbiomes, as well as from oceans, lakes, sediments, soils and the built environment. We conclude that the large gene inventories of huge phages reflect a conserved biological strategy, and that the phages are distributed across a broad bacterial host range and across Earth's ecosystems.
Phages, viruses that infect bacteria, are considered distinct from cellular life owing to their inability to carry out most biological processes required for reproduction. They are agents of ecosystem change because they prey on specific bacterial populations, mediate lateral gene transfer, alter host metabolism and redistribute bacterially derived compounds through cell lysis [2-4]. They spread antibiotic resistance [5] and disperse pathogenicity factors that cause disease in humans and animals [6,7].
Most knowledge about phages is based on laboratory-studied examples, the vast majority of which have genomes that are a few tens of kb in length. Widely used isolation-based methods select against large phage particles, and large phages can be excluded from phage concentrates obtained by passage through 100-nm or 200-nm filters [1]. In 2017, only 93 isolated phages with genomes that were more than 200 kb in length were published [1]. Sequencing of whole-community DNA can uncover phage-derived fragments; however, large genomes can still escape detection owing to fragmentation [8]. A new clade of human- and animal-associated megaphages was recently described on the basis of genomes that were manually curated to completion from metagenomic datasets [9]. This finding prompted us to carry out a more comprehensive analysis of microbial communities to evaluate the prevalence, diversity and ecosystem distribution of phages with large genomes. Previously, phages with genomes of more than 200 kb have been referred to as 'jumbophages' [1] or, in the case of phages with genomes of more than 500 kb, as megaphages [9]. As the set reconstructed here spans both size ranges we refer to them simply as 'huge phage...
Methanogenesis is an ancient metabolism of key ecological relevance, with direct impact on the evolution of Earth’s climate. Recent results suggest that the diversity of methane metabolisms and their derivations have probably been vastly underestimated. Here, by probing thousands of publicly available metagenomes for homologues of methyl-coenzyme M reductase complex (MCR), we have obtained ten metagenome-assembled genomes (MAGs) belonging to potential methanogenic, anaerobic methanotrophic and short-chain alkane-oxidizing archaea. Five of these MAGs represent under-sampled (e.g., Verstraetearchaeota, Methanonatronarchaeia, ANME-1) or previously genomically undescribed (ANME-2c) archaeal lineages. The remaining five MAGs correspond to lineages that are only distantly related to previously known methanogens and span the entire archaeal phylogeny. Comprehensive comparative annotation significantly expands the metabolic diversity and energy conservation systems of MCR-bearing archaea. It also suggests the potential existence of a yet uncharacterized type of methanogenesis linked to short-chain alkane/fatty acid oxidation in a previously undescribed class of archaea (‘Ca. Methanoliparia’). We redefine a common core of marker genes specific to methanogenic, anaerobic methanotrophic and short-chain alkane-oxidizing archaea, and propose a possible scenario for the evolutionary and functional transitions that led to the emergence of such metabolic diversity.
Genomes are an integral component of the biological information about an organism; thus, the more complete the genome, the more informative it is. Historically, bacterial and archaeal genomes were reconstructed from pure (monoclonal) cultures, and the first reported sequences were manually curated to completion. However, the bottleneck imposed by the requirement for isolates precluded genomic insights for the vast majority of microbial life. Shotgun sequencing of microbial communities, referred to initially as community genomics and subsequently as genome-resolved metagenomics, can circumvent this limitation by obtaining metagenome-assembled genomes (MAGs); but gaps, local assembly errors, chimeras, and contamination by fragments from other genomes limit the value of these genomes. Here, we discuss genome curation to improve and, in some cases, achieve complete (circularized, no gaps) MAGs (CMAGs). To date, few CMAGs have been generated, although notably some are from very complex systems such as soil and sediment. Through analysis of about 7000 published complete bacterial isolate genomes, we verify the value of cumulative GC skew in combination with other metrics to establish bacterial genome sequence accuracy. The analysis of cumulative GC skew identified potential misassemblies in some reference genomes of isolated bacteria and the repeat sequences that likely gave rise to them. We discuss methods that could be implemented in bioinformatic approaches for curation to ensure that metabolic and evolutionary analyses can be based on very high-quality genomes.
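The cumulative GC skew metric mentioned above can be illustrated with a minimal sketch. It assumes the common definition of GC skew, (G - C) / (G + C), computed over non-overlapping windows and then summed along the sequence. The function name `cumulative_gc_skew`, the window size and the toy sequence are hypothetical choices for illustration, not the authors' implementation. In many bacterial genomes the cumulative curve shows a single minimum and a single maximum, commonly interpreted as the replication origin and terminus, so additional abrupt reversals can hint at a misassembly.

```python
def cumulative_gc_skew(seq, window=1000, step=None):
    """Return (window_starts, skews, cumulative_skews) for a DNA string.

    Skew per window is (G - C) / (G + C); windows with no G or C get 0.0.
    The window and step sizes are illustrative defaults (assumptions).
    """
    step = step or window
    seq = seq.upper()
    starts, skews, cumulative = [], [], []
    running_total = 0.0
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running_total += skew
        starts.append(start)
        skews.append(skew)
        cumulative.append(running_total)
    return starts, skews, cumulative


if __name__ == "__main__":
    # Toy sequence (hypothetical): a G-rich half followed by a C-rich half,
    # so the cumulative curve rises, peaks near the midpoint, then falls.
    toy = "GGGA" * 250 + "CCCA" * 250
    starts, skews, cum = cumulative_gc_skew(toy, window=100)
    peak = cum.index(max(cum))
    print("peak window starts at", starts[peak])
```

On a real genome, the window size would need tuning to the genome length, and the curve is best inspected alongside other metrics such as read-mapping support, as the passage suggests.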