Shotgun sequencing enables the reconstruction of genomes from complex microbial communities, but because assembly does not reconstruct entire genomes, it is necessary to bin genome fragments. Here we present CONCOCT, a new algorithm that combines sequence composition and coverage across multiple samples, to automatically cluster contigs into genomes. We demonstrate high recall and precision on artificial as well as real human gut metagenome data sets.
BackgroundMicrobes are main drivers of biogeochemical cycles in oceans and lakes. Although the genome is a foundation for understanding the metabolism, ecology and evolution of an organism, few bacterioplankton genomes have been sequenced, partly due to difficulties in cultivating them.ResultsWe use automatic binning to reconstruct a large number of bacterioplankton genomes from a metagenomic time-series from the Baltic Sea, one of world’s largest brackish water bodies. These genomes represent novel species within typical freshwater and marine clades, including clades not previously sequenced. The genomes’ seasonal dynamics follow phylogenetic patterns, but with fine-grained lineage-specific variations, reflected in gene-content. Signs of streamlining are evident in most genomes, and estimated genome sizes correlate with abundance variation across filter size fractions. Comparing the genomes with globally distributed metagenomes reveals significant fragment recruitment at high sequence identity from brackish waters in North America, but little from lakes or oceans. This suggests the existence of a global brackish metacommunity whose populations diverged from freshwater and marine relatives over 100,000 years ago, long before the Baltic Sea was formed (8000 years ago). This markedly contrasts to most Baltic Sea multicellular organisms, which are locally adapted populations of freshwater or marine counterparts.ConclusionsWe describe the gene content, temporal dynamics and biogeography of a large set of new bacterioplankton genomes assembled from metagenomes. We propose that brackish environments exert such strong selection that lineages adapted to them flourish globally with limited influence from surrounding aquatic communities.Electronic supplementary materialThe online version of this article (doi:10.1186/s13059-015-0834-7) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.