Codon usage bias (CUB) results from the complex interplay between translational selection and mutational biases. Current methods for CUB analysis apply heuristics to integrate both components, limiting the depth and scope of CUB analysis as a technique to probe into the evolution and optimization of protein-coding genes. Here we introduce a self-consistent CUB index (scnRCA) that incorporates implicit correction for mutational biases, facilitating exploration of the translational selection component of CUB. We validate this technique using gene expression data and we apply it to a detailed analysis of CUB in the Pseudomonadales. Our results illustrate how the selective enrichment of specific codons among highly expressed genes is preserved in the context of genome-wide shifts in codon frequencies, and how the balance between mutational and translational biases leads to varying definitions of codon optimality. We extend this analysis to other moderate and fast growing bacteria and we provide unified support for the hypothesis that C- and A-ending codons of two-box amino acids, and the U-ending codons of four-box amino acids, are systematically enriched among highly expressed genes across bacteria. The use of an unbiased estimator of CUB allows us to report for the first time that the signature of translational selection is strongly conserved in the Pseudomonadales in spite of drastic changes in genome composition, and extends well beyond the core set of highly optimized genes in each genome. We generalize these results to other moderate and fast growing bacteria, hinting at selection for a universal pattern of gene expression that is conserved and detectable in conserved patterns of codon usage bias.
BackgroundMetagenomics enables the analysis of bacterial population composition and the study of emergent population features, such as shared metabolic pathways. Recently, we have shown that metagenomics datasets can be leveraged to characterize population-wide transcriptional regulatory networks, or meta-regulons, providing insights into how bacterial populations respond collectively to specific triggers. Here we formalize a Bayesian inference framework to analyze the composition of transcriptional regulatory networks in metagenomes by determining the probability of regulation of orthologous gene sequences. We assess the performance of this approach on synthetic datasets and we validate it by analyzing the copper-homeostasis network of Firmicutes species in the human gut microbiome.ResultsAssessment on synthetic datasets shows that our method provides a robust and interpretable metric for assessing putative regulation by a transcription factor on sets of promoter sequences mapping to an orthologous gene cluster. The inference framework integrates the regulatory contribution of secondary sites and can discern false positives arising from multiple instances of a clonal sequence. Posterior probabilities for orthologous gene clusters decline sharply when less than 20 % of mapped promoters have binding sites, but we introduce a sensitivity adjustment procedure to speed up computation that enhances regulation assessment in heterogeneous ortholog clusters. Analysis of the copper-homeostasis regulon governed by CsoR in the human gut microbiome Firmicutes reveals that CsoR controls itself and copper-translocating P-type ATPases, but not CopZ-type copper chaperones. Our analysis also indicates that CsoR frequently targets promoters with dual CsoR-binding sites, suggesting that it exploits higher-order binding conformations to fine-tune its activity.ConclusionsWe introduce and validate a method for the analysis of transcriptional regulatory networks from metagenomic data that enables inference of meta-regulons in a systematic and interpretable way. Validation of this method on the CsoR meta-regulon of gut microbiome Firmicutes illustrates the usefulness of the approach, revealing novel properties of the copper-homeostasis network in poorly characterized bacterial species and putting forward evidence of new mechanisms of DNA binding for this transcriptional regulator. Our approach will enable the comparative analysis of regulatory networks across metagenomes, yielding novel insights into the evolution of transcriptional regulatory networks.Electronic supplementary materialThe online version of this article (doi:10.1186/s13015-016-0082-8) contains supplementary material, which is available to authorized users.
Motivation: Data from metagenomics projects remain largely untapped for the analysis of transcriptional regulatory networks. Here, we provide proof-of-concept that metagenomic data can be effectively leveraged to analyze regulatory networks by characterizing the SOS meta-regulon in the human gut microbiome.Results: We combine well-established in silico and in vitro techniques to mine the human gut microbiome data and determine the relative composition of the SOS network in a natural setting. Our analysis highlights the importance of translesion synthesis as a primary function of the SOS response. We predict the association of this network with three novel protein clusters involved in cell wall biogenesis, chromosome partitioning and restriction modification, and we confirm binding of the SOS response transcriptional repressor to sites in the promoter of a cell wall biogenesis enzyme, a phage integrase and a death-on-curing protein. We discuss the implications of these findings and the potential for this approach for metagenome analysis.Contact: erill@umbc.eduSupplementary information: Supplementary data are available at Bioinformatics online.
Activation of CD4+ T cells requires the recognition of peptides that are presented by HLA class II molecules and can be assessed experimentally using the ELISpot assay. However, even given an individual’s HLA class II genotype, identifying which class II molecule is responsible for a positive ELISpot response to a given peptide is not trivial. The two main difficulties are the number of HLA class II molecules that can potentially be formed in a single individual (3–14) and the lack of clear peptide binding motifs for class II molecules. Here, we present a Bayesian framework to interpret ELISpot data (BIITE: Bayesian Immunogenicity Inference Tool for ELISpot); specifically BIITE identifies which HLA-II:peptide combination(s) are immunogenic based on cohort ELISpot data. We apply BIITE to two ELISpot datasets and explore the expected performance using simulations. We show this method can reach high accuracies, depending on the cohort size and the success rate of the ELISpot assay within the cohort.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.