2017
DOI: 10.7717/peerj.3712

PrePhyloPro: phylogenetic profile-based prediction of whole proteome linkages

Abstract: Direct and indirect functional links between proteins as well as their interactions as part of larger protein complexes or common signaling pathways may be predicted by analyzing the correlation of their evolutionary patterns. Based on phylogenetic profiling, here we present a highly scalable and time-efficient computational framework for predicting linkages within the whole human proteome. We have validated this method through analysis of 3,697 human pathways and molecular complexes and a comparison of our re…

Cited by 16 publications (17 citation statements)
References 74 publications

“…Historically, distance metrics between profiles have fallen into two categories: tree-based and vector-based metrics [6,17]. Comparing all-vs-all profiles to define a distance matrix using metrics detailed in other phylogenetic profiling approaches, such as mutual information, Hamming distance or tree-aware methods [6,18,62–64], scales quadratically with the number of profiles. The time it takes to calculate profiles and a distance between two profiles typically scales poorly with the number of genomes considered, especially with tree-based methods.…”
Section: Profile Construction With Weighted Minhashing and Database C…
Citation type: mentioning
confidence: 99%
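The quadratic scaling described in this statement can be illustrated with a minimal sketch, assuming binary presence/absence profiles and the Hamming distance as the vector-based metric. The protein names, the toy profiles, and the helper functions `hamming` and `all_vs_all` are hypothetical and are not taken from the cited work.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length binary profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(x != y for x, y in zip(a, b))


def all_vs_all(profiles):
    """Return {(name_i, name_j): distance} for every unordered pair of profiles."""
    return {
        (p, q): hamming(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }


# Toy presence/absence profiles over five hypothetical genomes.
profiles = {
    "protA": [1, 1, 0, 1, 0],
    "protB": [1, 1, 0, 0, 0],
    "protC": [0, 0, 1, 0, 1],
    "protD": [1, 0, 0, 1, 0],
}

distances = all_vs_all(profiles)
n = len(profiles)

# The number of comparisons is n * (n - 1) / 2, so it grows quadratically with n.
print(len(distances), n * (n - 1) // 2)
```

With n profiles the dictionary holds n(n-1)/2 entries, and each entry costs time proportional to the number of genomes, which is why exhaustive comparison becomes expensive at whole-proteome scale.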
“…Several studies have established the Jaccard similarity [65] between two profiles of presence and absence patterns across extant genomes as a valid phylogenetic profiling distance metric, which is able to capture an evolutionary signal closely related to shared protein functions [18,66,67]. This Minhashing techniques were devised to measure the similarity of documents and search for similar documents within large datasets containing billions of elements [69–71].…”
Section: Profile Construction With Weighted Minhashing and Database C…
Citation type: mentioning
confidence: 99%
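As a rough illustration of the two ideas named in this statement, the sketch below computes the exact Jaccard similarity of two presence profiles and then estimates it with a plain MinHash signature. It assumes each profile is represented as the set of genome identifiers in which the protein is present. It shows ordinary, unweighted MinHash rather than the weighted variant named in the section title, and the function names, the salted BLAKE2b hashing scheme, and the genome identifiers are all hypothetical.

```python
import hashlib


def jaccard(a, b):
    """Exact Jaccard similarity: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    # Convention chosen for this sketch: two empty profiles count as identical.
    return len(a & b) / len(union) if union else 1.0


def _hash(seed, item):
    """Deterministic 64-bit hash of an item, salted with a seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash value over the set."""
    items = list(items)
    if not items:
        raise ValueError("cannot build a signature for an empty profile")
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical genomes in which each protein has a detectable homologue.
profile_a = {"g1", "g2", "g3", "g4"}
profile_b = {"g2", "g3", "g4", "g5"}

exact = jaccard(profile_a, profile_b)  # 3 shared genomes out of 5 in the union
estimate = estimate_jaccard(
    minhash_signature(profile_a), minhash_signature(profile_b)
)
print(exact, estimate)
```

The signature has a fixed length regardless of how many genomes a profile covers, so comparing two signatures costs the same for small and large profiles. The estimate approaches the exact value as `num_hashes` grows.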
“…To construct profiles representing groups of homologues, some pipelines resort to all-vs-all sequence similarity searches to derive orthologous groups and only count binary presence or absence of a member of each group in a limited number of genomes [19, 20] or forgo this step altogether and ignore the evolutionary history of each protein family, relying instead on co-occurrence in extant genomes [21]. Other tree-based methods infer the underlying evolutionary history from the presence of extant homologues [22].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Prominent examples are co-expression networks (Willsey et al., 2013), co-dependency networks (e.g. AchillesNet; Li et al., 2018), co-evolution networks (Niu et al., 2017), metabolic pathways (Kanehisa et al., 2017) or protein–protein interaction (PPI) networks (Lage et al., 2007; Li et al., 2017; Szklarczyk et al., 2019). Especially, PPI networks constitute an interesting representation of gene interactions, as they commonly combine information from different data sources, tissues, and molecular processes at different scales.…”
Section: Introduction
Citation type: mentioning
confidence: 99%