Background: Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulatory relationships in organism cells. Despite the large number of available Gene Regulatory Network (GRN) inference methods, the problem remains challenging: the underdetermination in the space of possible solutions requires additional constraints that incorporate a priori information on gene interactions.
Methods: Weighting all possible pairwise gene relationships by a probability of edge presence, we formulate the regulatory network inference as a discrete variational problem on graphs. We enforce biologically plausible coupling between groups and types of genes by minimizing an edge labeling functional coding for a priori structures. The optimization is carried out with Graph cuts, an approach popular in image processing and computer vision. We compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge.
Results: Our BRANE Cut approach infers the five DREAM4 in silico networks more accurately (with improvements from 6 % to 11 %). On a real Escherichia coli compendium, an improvement of 11.8 % compared to CLR and 3 % compared to GENIE3 is obtained in terms of Area Under Precision-Recall curve. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, our method achieves a performance similar to that of GENIE3, while being more than seven times faster. The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html.
Conclusions: BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the-art GRN inference methods. It is applicable as a generic network inference post-processing, due to its computational efficiency.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0754-2) contains supplementary material, which is available to authorized users.
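To make the phrase "weighted graph thresholding" concrete, here is a minimal Python sketch. It is not the BRANE Cut functional, which the abstract says is minimized with Graph cuts; it only illustrates the simpler idea of keeping an edge when its weight exceeds a threshold that depends on the type of genes involved. The function names, the two thresholds `lam_tf` and `lam_other`, and the toy data are all assumptions made for illustration.

```python
def threshold_edges(weights, regulators, lam_tf, lam_other):
    """Keep an edge when its weight exceeds a type-dependent threshold.

    weights    : dict mapping (gene_i, gene_j) -> edge weight in [0, 1]
    regulators : set of genes assumed to be putative regulators (hypothetical prior)
    lam_tf     : threshold for edges touching at least one regulator
    lam_other  : threshold for all remaining edges
    """
    kept = set()
    for (i, j), w in weights.items():
        lam = lam_tf if (i in regulators or j in regulators) else lam_other
        if w > lam:
            kept.add((i, j))
    return kept


def precision_recall(kept, gold):
    """Precision and recall of a predicted edge set against a gold standard."""
    tp = len(kept & gold)
    precision = tp / len(kept) if kept else 1.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall


# Toy example (made-up weights, not data from the paper).
weights = {("a", "b"): 0.9, ("a", "c"): 0.4, ("b", "c"): 0.6, ("c", "d"): 0.7}
gold = {("a", "b"), ("a", "c"), ("b", "c")}
kept = threshold_edges(weights, regulators={"a"}, lam_tf=0.3, lam_other=0.65)
precision, recall = precision_recall(kept, gold)
```

Sweeping the thresholds and recording the resulting precision and recall pairs yields a precision-recall curve; the area under that curve is the evaluation measure reported in the Results above.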
Background: The filamentous fungus Trichoderma reesei is the main industrial cellulolytic enzyme producer. Several strains have been developed in the past using random mutagenesis, and despite impressive performance enhancements, the pressure for low-cost cellulases has stimulated continuous research in the field. In this context, comparative study of the lower and higher producer strains obtained through random mutagenesis using systems biology tools (genome and transcriptome sequencing) can shed light on the mechanisms of cellulase production and help identify genes linked to performance. Previously, our group published comparative genome sequencing of the lower and higher producer strains NG 14 and RUT C30. In this follow-up work, we examine how these mutations affect phenotype as regards the transcriptome and cultivation behaviour.
Results: We performed kinetic transcriptome analysis of early lactose-induced enzyme production in the NG 14 and RUT C30 strains, using bioreactor cultivations close to an industrial cultivation regime. RUT C30 exhibited both earlier onset of protein production (3 h) and higher steady-state productivity. A rather small number of genes compared to previous studies were regulated (568), most of them being specific to the NG 14 strain (319). Clustering analysis highlighted similar behaviour for some functional categories and allowed us to distinguish between induction-related genes and productivity-related genes. Cross-comparison of our transcriptome data with previously identified mutations revealed that most genes from our dataset have not been mutated. Interestingly, the few mutated genes belong to the same clusters, suggesting that these clusters contain genes playing a role in strain performance.
Conclusions: This is the first kinetic transcriptomic analysis carried out under conditions approaching industrial ones with two related strains of T. reesei showing distinctive cultivation behaviour. Our study sheds light on some of the events occurring in these strains following induction by lactose. The fact that few regulated genes have been affected by mutagenesis suggests that the induction mechanism is essentially intact compared to that for the wild-type isolate QM6a and might be engineered for further improvement of T. reesei. Genes from two specific clusters might be potential targets for such genetic engineering.
Electronic supplementary material: The online version of this article (doi:10.1186/s13068-014-0173-z) contains supplementary material, which is available to authorized users.
Abstract: Discovering meaningful gene interactions is crucial for the identification of novel regulatory processes in cells. Accurately building the related graphs remains challenging due to the large number of possible solutions from available data. Nonetheless, enforcing a priori knowledge on the graph structure, such as modularity, may reduce network indeterminacy issues. BRANE Clust (Biologically-Related A priori Network Enhancement with Clustering) refines gene regulatory network (GRN) inference thanks to cluster information. It works as a post-processing tool for inference methods (e.g. CLR, GENIE3). In BRANE Clust, the clustering is based on the inversion of a system of linear equations involving a graph-Laplacian matrix promoting a modular structure. Our approach is validated on DREAM4 and DREAM5 datasets with objective measures, showing significant comparative improvements. We provide additional insights on the discovery of novel regulatory or co-expressed links in the inferred Escherichia coli network evaluated using the STRING database. The comparative pertinence of clustering is discussed computationally (SIMoNe, WGCNA, X-means) and biologically (RegulonDB). BRANE Clust software is available at:
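The abstract above describes clustering through a linear system involving a graph-Laplacian matrix. One well-known way such a system arises is seeded label propagation (the random-walker formulation): a few nodes carry known labels, and the scores of the remaining nodes are obtained by solving a Laplacian system. The Python sketch below illustrates that generic technique only; whether it matches the exact BRANE Clust formulation is an assumption, and the function name, the choice of seeds, and the toy weights are hypothetical.

```python
import numpy as np


def laplacian_label_propagation(W, seeds):
    """Assign each node to a cluster by solving a graph-Laplacian linear system.

    W     : symmetric, non-negative (n, n) weight matrix of a connected graph
    seeds : dict mapping node index -> cluster label (labels are 0..k-1)
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                 # combinatorial graph Laplacian
    seeded = sorted(seeds)
    free = [i for i in range(n) if i not in seeds]
    k = max(seeds.values()) + 1

    # One-hot label indicators for the seeded nodes.
    Y = np.zeros((len(seeded), k))
    for row, node in enumerate(seeded):
        Y[row, seeds[node]] = 1.0

    # Solve L_ff X = -L_fs Y for the scores of the unlabeled nodes.
    X = np.linalg.solve(L[np.ix_(free, free)], -L[np.ix_(free, seeded)] @ Y)

    labels = np.empty(n, dtype=int)
    for node in seeded:
        labels[node] = seeds[node]
    labels[free] = X.argmax(axis=1)                # most probable seed label
    return labels


# Toy graph: two tightly connected triangles joined by one weak edge.
W = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[a, b] = W[b, a] = 1.0
W[2, 3] = W[3, 2] = 0.1
labels = laplacian_label_propagation(W, seeds={0: 0, 5: 1})
```

On this toy graph the weak edge between the two triangles separates the clusters, which is the modular structure the Laplacian term is meant to favour.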