2015
DOI: 10.1371/journal.pcbi.1004552

Gene Prioritization by Compressive Data Fusion and Chaining

Abstract: Data integration procedures combine heterogeneous data sets into predictive models, but they are limited to data explicitly related to the target object type, such as genes. Collage is a new data fusion approach to gene prioritization. It considers data sets of various association levels with the prediction task, utilizes collective matrix factorization to compress the data, and chaining to relate different object types contained in a data compendium. Collage prioritizes genes based on their similarity to seve…
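The abstract mentions two ideas, compressing relation matrices into latent factors and chaining those factors to relate object types that are not directly connected. The sketch below is only an illustration of one simple reading of "chaining", not Collage's published algorithm: it assumes the latent factors are already available, composes them by matrix multiplication, and ranks genes by cosine similarity to a reference gene. All names and numbers (`gene_factor`, `backbones`, `rank_genes`, the toy matrices, the choice of cosine similarity) are hypothetical and chosen for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def chain(gene_factor, backbones):
    """Compose a gene latent matrix with a sequence of latent matrices.

    Assumption: each matrix in `backbones` maps the latent space of one
    object type to the latent space of the next one along the chain.
    """
    profile = gene_factor
    for s in backbones:
        profile = matmul(profile, s)
    return profile


def rank_genes(profiles, names, reference):
    """Rank non-reference genes by mean cosine similarity to reference genes."""
    ref_rows = [profiles[names.index(r)] for r in reference]
    scores = {}
    for name, row in zip(names, profiles):
        if name in reference:
            continue
        scores[name] = sum(cosine(row, r) for r in ref_rows) / len(ref_rows)
    return sorted(scores, key=scores.get, reverse=True)


# Toy latent factors (hypothetical values, not derived from real data).
names = ["g1", "g2", "g3", "g4"]
gene_factor = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
backbones = [
    [[2.0, 0.0], [0.0, 1.0]],  # gene space -> intermediate object type
    [[1.0, 0.5], [0.0, 1.0]],  # intermediate -> distant object type
]

profiles = chain(gene_factor, backbones)
ranking = rank_genes(profiles, names, reference=["g1"])
print(ranking)
```

In this toy setup the chained profiles place genes in the latent space of an object type they were never directly measured against, which is the intuition behind relating distant data sets. A real pipeline would first have to learn the factors from data, which this sketch does not attempt.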

Cited by 22 publications (20 citation statements)
References 43 publications (50 reference statements)
“…In particular, we compared Mashup to a recently proposed matrix factorization-based approach, CMF (Žitnik et al, 2015), which views heterogeneous data matrices as relations between different object types that can be approximated via a low-rank factorization. While straightforward CMF has limited use of network data as additional constraints on the parameters to be learned, we considered a favorably modified CMF that directly factorizes the network data (i.e., more similar to Mashup) and found that Mashup significantly outperforms this approach as well (Figure S1).…”
Section: Results
Citation type: mentioning
confidence: 99%
“…The overall goal is to identify these genes and, in a second step, experimentally validate these genes only. Many different computational methods that use different algorithms, datasets, and strategies have been developed [195,224,226,229,230,231,232,233,234]. Some of these approaches have been implemented as publicly available tools and several of these approaches have been experimentally validated [195,226,230,231,227].…”
Section: Protein Function Prediction
Citation type: mentioning
confidence: 99%
“…The question of distinguishing different semantics that exist within biomedical data systems remains largely unexplored. Two notable exceptions include a meta-path-based approach for gene–disease link prediction in heterogeneous networks (Himmelstein and Baranzini, 2015) and a latent-chain-based approach for gene prioritization (Zitnik et al, 2015). These approaches, however, are algorithmically different.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Challenges in the joint consideration of systems of datasets, such as that in Figure 1, include inferring accurate models to predict disease traits and outcomes, elucidating important disease genes and generating insight into the genetic underpinnings of complex diseases (Barabási et al, 2011; Han et al, 2013; Ruffalo et al, 2015; Taşan et al, 2015). We would like these models to collectively consider the breadth of available data, from whole-genome sequencing to transcriptomic, methylomic and metabolic data (Navlakha and Kingsford, 2010; Greene et al, 2015; Zitnik et al, 2015). A major barrier preventing existing methods from fully exploiting entire data collections is that individual datasets usually cannot be directly related to each other.…”
Section: Introduction
Citation type: mentioning
confidence: 99%