2018
DOI: 10.1101/439620
Preprint

Benchmarking network propagation methods for disease gene identification

Abstract: Background: In-silico identification of potential disease genes has become an essential aspect of drug target discovery. Recent studies suggest that one powerful way to identify successful targets is through the use of genetic and genomic information. Given a known disease gene, leveraging intermolecular connections via networks and pathways seems a natural way to identify other genes and proteins that are involved in similar biological processes, and that can therefore be analysed as additional targets. Results: H…

Cited by 4 publications
(6 citation statements)
References 52 publications
“…What is more revealing is that all the proxy gene sets identified in these networks have an average gene score higher than a random distribution and that the size of this effect largely tracks the enrichments observed above: unsurprisingly the effect is larger in the more advanced methods such as Pascal that use the genetic signal directly (Figure 4; right) but is also demonstrated for naïve methods (Figure 4; left, and supplementary Figure 4), and is true even based on very different underlying network structures (supplementary Figure 5). This observation is consistent with previous work that has shown that genes with nominally significant Pascal scores from a GWAS can be used with network information to predict genetic associations subsequently found in independent genetic studies for the same trait [14].…”
Section: Results (supporting)
confidence: 92%
“…A weakness of our study is that we do not test other network propagation methods. However, many such methods are based around some version of the random walk with restart algorithm or a mathematically equivalent conception and in previous work we have showed that many such algorithms perform equivalently on a highly related problem [14]. One potential avenue for development in this area would be in graph based deep learning that could explicitly model other additional sources of disease association such as those from target information integration platforms such as Open Targets [24].…”
Section: Discussion (mentioning)
confidence: 99%
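
The random walk with restart named in the statement above can be illustrated on a toy graph. The following is a minimal sketch, not the implementation used in the preprint or in the citing paper: the function name, the `restart` value of 0.5, the gene labels, and the four-node network are all assumptions chosen for illustration. It iterates p ← (1 − r)·W·p + r·p0, where W spreads each node's score evenly over its neighbours and p0 is uniform over the seed genes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-12, max_iter=10_000):
    """Propagate seed scores over an undirected graph.

    adjacency: dict mapping each node to a set of neighbouring nodes.
    seeds:     iterable of known disease genes (hypothetical labels here).
    restart:   probability of jumping back to the seed set at each step.
    """
    nodes = sorted(adjacency)
    seeds = set(seeds)
    if not seeds or not seeds <= set(nodes):
        raise ValueError("seeds must be a non-empty subset of the graph nodes")

    # Initial distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        # Restart term: a fraction of the mass returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            degree = len(adjacency[u])
            if degree == 0:
                # Isolated node: its walking mass stays in place.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / degree
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network A - B - C - D with a single seed gene "A" (illustrative only).
toy_network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
scores = random_walk_with_restart(toy_network, seeds=["A"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Genes closer to the seed receive higher scores, which is the property that prioritisation methods of this family exploit.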
“…Explanatory models have found use in the formal description of differences in performance as a function of design factors (Lopez-del Rio et al., 2019; Picart-Armada et al., 2019). Following (Picart-Armada et al., 2019), the trends in AUROC and AUPRC were described through logistic-like quasibinomial models with a logit link function, as a generalisation of logistic models to prevent over and under-dispersion issues.…”
Section: Methods (mentioning)
confidence: 99%
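
For readers unfamiliar with the model class named in the statement above, a quasibinomial model with a logit link is usually written as below. This is the generic textbook form, not the exact specification from either paper; the symbols (the performance metric y_i in [0, 1], its mean μ_i, the design-factor covariates x_i, the coefficients β, and the dispersion parameter φ) are notation assumed here for illustration.

```latex
\begin{aligned}
\operatorname{logit}(\mu_i) &= \log\frac{\mu_i}{1-\mu_i} = \mathbf{x}_i^{\top}\boldsymbol{\beta}, \\
\operatorname{Var}(y_i) &= \phi\,\mu_i\,(1-\mu_i).
\end{aligned}
```

Setting φ = 1 gives the binomial variance function; estimating φ from the data is what lets the model absorb over- or under-dispersion.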
“…Most published prioritization algorithms contain a validation component, but each study takes its own approach to do this, making comparison between algorithms difficult. A common benchmarking approach is to use "gold standard" genes (i.e., genes with a known link to the trait of interest) [9,19,20] to calculate a receiver operating characteristic or similar metric. Unfortunately, this strategy relies heavily on prior knowledge of disease etiology and is biased toward well-studied genes in well-characterized biological pathways.…”
Section: Introduction (mentioning)
confidence: 99%
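
The gold-standard benchmarking approach described in the statement above can be sketched as an area under the ROC curve computed from gene scores. This is a minimal illustration under stated assumptions: the function name, gene labels, and scores are hypothetical, and the pairwise (Mann-Whitney) formulation is one standard way to compute AUROC, not necessarily the one used in the cited studies.

```python
def auroc(gene_scores, gold_standard):
    """Area under the ROC curve via pairwise comparison.

    gene_scores:   dict mapping gene -> prioritisation score (higher = more likely).
    gold_standard: set of genes with a known link to the trait.

    Returns the probability that a randomly chosen gold-standard gene is
    scored above a randomly chosen non-gold-standard gene (ties count 0.5).
    """
    positives = [s for g, s in gene_scores.items() if g in gold_standard]
    negatives = [s for g, s in gene_scores.items() if g not in gold_standard]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative gene")

    wins = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four genes, two of which are "gold standard".
example_scores = {"g1": 0.9, "g2": 0.8, "g3": 0.3, "g4": 0.1}
example_gold = {"g1", "g3"}
example_auc = auroc(example_scores, example_gold)
print(example_auc)
```

Because the positives are defined by prior knowledge, any bias in the gold-standard set toward well-studied genes carries straight through to this metric, which is the limitation the statement points out.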