Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size

Guo, Wenbin; Calixto, Cristiane P. G.; Τzioutziou, Nikoleta A.; Lin, Ping; Waugh, Robbie; Brown, John W.; Zhang, Runxuan

doi:10.1186/s12918-017-0440-2

Cited by 16 publications

(11 citation statements)

References 64 publications


“…Thus far, we have focused on this application using a set of known transcription factor-target pairs to assess performance, as done in DREAM5. However, network inference methods can also be used to determine functional overlap from transcriptional (or other) data across many conditions [ 4 , 17 , 21 , 22 ]. Thus, we wanted to assess the ability of bootstrapped inference methods to infer edges between genes in the same pathways.…”

Section: Results (mentioning)

confidence: 99%

“…Guo et al 2017 employed use of partial correlations (i.e. isolation of a single gene pair at a time), extracting only the most highly correlated relationships as edges in their RLowPC (Relevance Low order Partial Correlation) method [ 17 ]. Friedman et al 1999 applied bootstrapping to yield a successful result, but by resampling genes, not conditions, and applying to small, synthetic datasets [ 14 ].…”

Section: Introduction (mentioning)

confidence: 99%
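The partial-correlation idea mentioned in the statement above can be illustrated with a minimal sketch. This is the generic first-order partial correlation formula, not the RLowPC method itself; the function name `first_order_partial_corr` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def first_order_partial_corr(x, y, z):
    """Correlation between x and y after controlling for one variable z.

    Hypothetical helper: the textbook first-order partial correlation,
    not the RLowPC implementation.
    """
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Toy data (assumed): x and y are both driven by z, so their raw
# correlation is high even though they have no direct link.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
x = z + 0.3 * rng.normal(size=200)
y = z + 0.3 * rng.normal(size=200)

raw = np.corrcoef(x, y)[0, 1]
partial = first_order_partial_corr(x, y, z)
print(f"raw correlation: {raw:.3f}, partial correlation given z: {partial:.3f}")
```

In this toy setting the raw correlation is driven by the shared regulator, and conditioning on it shrinks the score, which is the intuition behind using partial correlations to discount indirect edges.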


Improving network inference algorithms using resampling methods

et al. 2018


Background: Relatively small changes to gene expression data dramatically affect co-expression networks inferred from that data which, in turn, can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes the generalization from statistical theory holds true in biological network inference applications.

Results: We evaluated the effect of bootstrap aggregation on inferred networks using commonly applied network inference methods in terms of stability, or resilience to perturbations in the underlying expression data, a metric for accuracy, and functional enrichment of edge interactions.

Conclusion: Bootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement to accuracy assessed by each method’s ability to link genes in the same functional pathway.

Electronic supplementary material: The online version of this article (10.1186/s12859-018-2402-0) contains supplementary material, which is available to authorized users.
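As a rough illustration of the bootstrap aggregation described in this abstract, the sketch below resamples conditions (columns) with replacement, scores every gene pair on each resample, and averages the scores. The use of absolute Pearson correlation as the inference method, the function name `bagged_correlation_network`, and the toy expression matrix are all assumptions; the cited paper evaluates several inference methods that are not reproduced here.

```python
import numpy as np

def bagged_correlation_network(expr, n_boot=100, seed=0):
    """Average |Pearson r| over bootstrap resamples of the samples.

    expr: array of shape (n_genes, n_samples).
    Hypothetical helper for illustration; assumes no gene becomes
    constant within a resample (which would yield NaN correlations).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = expr.shape
    total = np.zeros((n_genes, n_genes))
    for _ in range(n_boot):
        # Draw sample (condition) indices with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        total += np.abs(np.corrcoef(expr[:, idx]))
    scores = total / n_boot
    np.fill_diagonal(scores, 0.0)  # ignore self-edges
    return scores

# Toy data (assumed): genes 0 and 1 are co-expressed, gene 2 is independent.
rng = np.random.default_rng(1)
base = rng.normal(size=30)
expr = np.vstack([base, base + 0.2 * rng.normal(size=30), rng.normal(size=30)])

scores = bagged_correlation_network(expr)
print(np.round(scores, 2))
```

Averaging over resamples is one simple aggregation choice; ranking edges within each resample and aggregating ranks is another, and the abstract does not say which is used.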


“…One reasons for this poor performance is that in DREAM4 networks, there are gene clusters with highly cohesive expression patterns. All pairs of such a cluster have high correlations between them that can result in a large number of indirect edges in the learned networks [75]. Another reason is that the DREAM4 networks are even sparser than the AR(1) models, the five 100 gene networks have an average of 2.31% arcs in the network.…”

Section: Discussion (mentioning)

confidence: 99%

“…For evaluation, we use the same metrics as mentioned in Section 3.5.1. However, following [75], we evaluated learned networks based on undirected network structures. This is because our methods do not incorporate perturbation information as prior knowledge.…”

Section: Simulation Study 2 - Dream4 Network (mentioning)

confidence: 99%

Dynamic Bayesian Network Learning to Infer Sparse Models From Time Series Gene Expression Data

Ajmal, Madden 2022

IEEE/ACM Trans. Comput. Biol. and Bioinf.


One of the key challenges in systems biology is to derive gene regulatory networks (GRNs) from complex high-dimensional sparse data. Bayesian networks (BNs) and dynamic Bayesian networks (DBNs) have been widely applied to infer GRNs from gene expression data. GRNs are typically sparse but traditional approaches of BN structure learning to elucidate GRNs often produce many spurious (false positive) edges. We present two new BN scoring functions, which are extensions to the Bayesian Information Criterion (BIC) score, with additional penalty terms and use them in conjunction with DBN structure search methods to find a graph structure that maximises the proposed scores. Our BN scoring functions offer better solutions for inferring networks with fewer spurious edges compared to the BIC score. The proposed methods are evaluated extensively on auto regressive and DREAM4 benchmarks. We found that they significantly improve the precision of the learned graphs, relative to the BIC score. The proposed methods are also evaluated on three real time series gene expression datasets. The results demonstrate that our algorithms are able to learn sparse graphs from high-dimensional time series data. The implementation of these algorithms is open source and is available in form of an R package on GitHub at https://github.com/HamdaBinteAjmal/DBN4GRN, along with the documentation and tutorials.


“…Most studies that do use methylation data estimate networks by directly correlating all CpG site pairs, with a focus on module detection [2][3][4][5][6]. However, the typical small sample-to-variable ratio limits the accuracy of the resulting networks [7]. Also, interpreting methylation networks is more difficult, since less is known about the functional role and gene targets of non-coding regulatory regions.…”

Section: Introduction (mentioning)

confidence: 99%

DNA Methylation Network Estimation with Sparse Latent Gaussian Graphical Model

Jafarzadeh, Cole, et al. 2018

Preprint


Inferring molecular interaction networks from genomics data is important for advancing our understanding of biological processes. Whereas considerable research effort has been placed on inferring such networks from gene expression data, network estimation from DNA methylation data has received very little attention due to the substantially higher dimensionality and complications with result interpretation for non-genic regions. To combat these challenges, we propose here an approach based on sparse latent Gaussian graphical model (SLGGM). The core idea is to perform network estimation on q latent variables as opposed to d CpG sites, with q<

