In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.
Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.
Every protein has a biosynthetic cost to the cell based on the synthesis of its constituent amino acids. In order to optimise growth and reproduction, natural selection is expected, where possible, to favour the use of proteins whose constituents are cheaper to produce, as reduced biosynthetic cost may confer a fitness advantage to the organism. Quantifying the cost of amino acid biosynthesis presents challenges, since energetic requirements may change across different cellular and environmental conditions. We developed a systems biology approach to estimate the cost of amino acid synthesis based on genome-scale metabolic models and investigated the effects of the cost of amino acid synthesis on Saccharomyces cerevisiae gene expression and protein evolution. First, we used our two new and six previously reported measures of amino acid cost in conjunction with codon usage bias, tRNA gene number and atomic composition to identify which of these factors best predict transcript and protein levels. Second, we compared amino acid cost with rates of amino acid substitution across four species in the genus Saccharomyces. Regardless of which cost measure is used, amino acid biosynthetic cost is weakly associated with transcript and protein levels. In contrast, we find that biosynthetic cost and amino acid substitution rates show a negative correlation, but for only a subset of cost measures. In the economy of the yeast cell, we find that the cost of amino acid synthesis plays a limited role in shaping transcript and protein expression levels compared to that of translational optimisation. Biosynthetic cost does, however, appear to affect rates of amino acid evolution in Saccharomyces, suggesting that expensive amino acids may only be used when they have specific structural or functional roles in protein sequences. However, as there appears to be no single currency to compute the cost of amino acid synthesis across all cellular and environmental conditions, we conclude that a systems approach is necessary to unravel the full effects of amino acid biosynthetic cost in complex biological systems.
In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.. CC-BY 4.0 International license peer-reviewed) is the author/funder. It is made available under a
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.