In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.
Pseudomonas is a large and diverse genus of Gammaproteobacteria. To provide a framework for discovery of evolutionary and taxonomic relationships of these bacteria, we compared the genomes of type strains of 163 species and 3 additional subspecies of Pseudomonas, including 118 genomes sequenced herein. A maximum likelihood phylogeny of the 166 type strains based on protein sequences of 100 single-copy orthologous genes revealed thirteen groups of Pseudomonas, composed of two to sixty-three species each. Pairwise average nucleotide identities and alignment fractions were calculated for the data set of the 166 type strains and 1224 genomes of Pseudomonas available in public databases. Results revealed that 394 of the 1224 genomes were distinct from any type strain, suggesting that the type strains represent only a fraction of the genomic diversity of the genus. The core genome of Pseudomonas was determined to contain 794 genes conferring primarily housekeeping functions. The results of this study provide a phylogenetic framework for future studies aiming to resolve the classification and phylogenetic relationships, identify new gene functions and phenotypes, and explore the ecological and metabolic potential of Pseudomonas spp.
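The comparison of public genomes against type strains described above can be illustrated with a minimal sketch. The abstract does not state which cutoffs were applied, so the 95% ANI and 0.6 alignment-fraction thresholds below are assumptions chosen for illustration, and the names `Comparison`, `assign_to_type_strains`, and `distinct_genomes` are hypothetical rather than taken from the study.

```python
from dataclasses import dataclass

# Assumed cutoffs for illustration only; the abstract does not report them.
ANI_THRESHOLD = 95.0  # percent average nucleotide identity
AF_THRESHOLD = 0.6    # alignment fraction, on a 0-1 scale


@dataclass(frozen=True)
class Comparison:
    """One pairwise genome-versus-type-strain comparison (hypothetical record)."""
    genome: str
    type_strain: str
    ani: float  # percent
    af: float   # alignment fraction, 0-1


def assign_to_type_strains(comparisons, ani_min=ANI_THRESHOLD, af_min=AF_THRESHOLD):
    """Map each genome to its best-matching type strain, or None if no
    comparison passes both thresholds."""
    best = {}
    for c in comparisons:
        best.setdefault(c.genome, None)  # register the genome even if nothing passes
        if c.ani >= ani_min and c.af >= af_min:
            current = best[c.genome]
            if current is None or c.ani > current.ani:
                best[c.genome] = c
    return {g: (c.type_strain if c is not None else None) for g, c in best.items()}


def distinct_genomes(comparisons, ani_min=ANI_THRESHOLD, af_min=AF_THRESHOLD):
    """Return the sorted genomes that match no type strain under the cutoffs."""
    assigned = assign_to_type_strains(comparisons, ani_min, af_min)
    return sorted(g for g, t in assigned.items() if t is None)


# Made-up example values, not data from the study.
example = [
    Comparison("genome_1", "type_A", 98.2, 0.85),
    Comparison("genome_1", "type_B", 96.0, 0.70),
    Comparison("genome_2", "type_A", 88.0, 0.50),
    Comparison("genome_3", "type_B", 96.0, 0.30),
]
print(assign_to_type_strains(example))
print(distinct_genomes(example))
```

Requiring both thresholds keeps a high ANI computed over a small aligned region from counting as a match; genomes left unassigned under this scheme correspond to the "distinct from any type strain" category in the abstract.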
Marine algae convert a substantial fraction of fixed carbon dioxide into various polysaccharides. Flavobacteriia that specialize in algal polysaccharide degradation feature genomic clusters termed polysaccharide utilization loci (PULs). As knowledge of extant PUL diversity is sparse, we sequenced the genomes of 53 North Sea Flavobacteriia and obtained 400 PULs. Bioinformatic PUL annotations suggest usage of a large array of polysaccharides, including laminarin, α-glucans, and alginate as well as mannose-, fucose-, and xylose-rich substrates. Many of the PULs exhibit new genetic architectures and suggest substrates rarely described for marine environments. The isolates' PUL repertoires often differed considerably within genera, corroborating ecological niche-associated glycan partitioning. Polysaccharide uptake in Flavobacteriia is mediated by SusCD-like transporter complexes. Respective protein trees revealed clustering according to polysaccharide specificities predicted by PUL annotations. Using the trees, we analyzed expression of SusC/D homologs in multiyear phytoplankton bloom-associated metaproteomes and found indications of profound changes in microbial utilization of laminarin, α-glucans, β-mannan, and sulfated xylan. We hence suggest the suitability of SusC/D-like transporter protein expression within heterotrophic bacteria as a proxy for the temporal utilization of discrete polysaccharides.
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.
Burkholderia sensu lato is a large and complex group, containing pathogenic, phytopathogenic, symbiotic and non-symbiotic strains from a very wide range of environmental (soil, water, plants, fungi) and clinical (animal, human) habitats. Its taxonomy has been evaluated several times through the analysis of 16S rRNA sequences, concatenated sequences of 4–7 housekeeping genes, and lately by genome sequences. Currently, the division of this group into Burkholderia, Caballeronia, Paraburkholderia, and Robbsia is strongly supported by genome analysis. These new genera broadly correspond to the various habitats/lifestyles of Burkholderia s.l., e.g., all the plant beneficial and environmental (PBE) strains are included in Paraburkholderia (which also includes all the N2-fixing legume symbionts) and Caballeronia, while most of the human and animal pathogens are retained in Burkholderia sensu stricto. However, none of these genera can accommodate two important groups of species. One of these includes the closely related Paraburkholderia rhizoxinica and Paraburkholderia endofungorum, which are both symbionts of the fungal phytopathogen Rhizopus microsporus. The second group comprises the Mimosa-nodulating bacterium Paraburkholderia symbiotica, the phytopathogen Paraburkholderia caryophylli, and the soil bacteria Burkholderia dabaoshanensis and Paraburkholderia soli. In order to clarify their positions within Burkholderia sensu lato, a phylogenomic approach based on a maximum likelihood analysis of conserved genes from more than 100 Burkholderia sensu lato species was carried out. Additionally, the average nucleotide identity (ANI) and amino acid identity (AAI) were calculated. The data strongly supported the existence of two distinct and unique clades, which support the description of two novel genera, Mycetohabitans gen. nov. and Trinickia gen. nov. The newly proposed combinations are Mycetohabitans endofungorum comb. nov., Mycetohabitans rhizoxinica comb. nov., Trinickia caryophylli comb. nov., Trinickia dabaoshanensis comb. nov., Trinickia soli comb. nov., and Trinickia symbiotica comb. nov. Given that the division between the genera that comprise Burkholderia s.l. in terms of their lifestyles is often complex, differential characteristics of the genomes of these new combinations were investigated. In addition, two important lifestyle-determining traits (diazotrophy and/or symbiotic nodulation, and pathogenesis) were analyzed in depth, i.e., the phylogenetic positions of nitrogen fixation and nodulation genes in Trinickia vis-à-vis other Burkholderiaceae were determined, and the possibility of pathogenesis in Mycetohabitans and Trinickia was tested by performing infection experiments on plants and the nematode Caenorhabditis elegans. It is concluded that (1) T. symbiotica nif and nod genes fit within the wider Mimosa-nodulating Burkholderiaceae but appear in separate clades and that T. caryophylli nif genes are basal to the free-living Burkholderia s.l. strains, while with regard to pathogenesis (2) none of the Mycetoh...