2014
DOI: 10.1039/c3mb70486f

Ensemble learning prediction of protein–protein interactions using proteins functional annotations

Abstract: Protein-protein interactions are important for the majority of biological processes. A significant number of computational methods have been developed to predict protein-protein interactions using protein sequence, structural and genomic data. Vast experimental data is publicly available on the Internet, but it is scattered across numerous databases. This fact motivated us to create and evaluate new high-throughput datasets of interacting proteins. We extracted interaction data from DIP, MINT, BioGRID and IntA…

Cited by 48 publications (28 citation statements)
References 48 publications
“…We used the Yeast and Human protein interaction datasets derived by Indrajit Saha et al [27]. The dataset was composed of 3 datasets extracted from the DIP, MINT, BioGrid, and IntAct databases.…”
Section: Methods (mentioning)
confidence: 99%
“…These computational methods can be roughly divided into sequence based [11,12,13,14,15,16,17,18,19], structure based [20,21,22,23,24], and function annotation based [25,26,27,28,29] methods with different coding methods, such as autocovariance (AC) [12], local descriptors (LD) [19], conjoint triad (CT) [11], Geary autocorrelation (GAC) [30], Moran autocorrelation (MAC) [31], and normalized Moreau–Broto autocorrelation (NMBAC) [32]. Sequence-based methods have the advantage of not requiring expensive and time-consuming processes to determine protein structures.…”
Section: Introduction (mentioning)
confidence: 99%
“…The construction of high quality negative examples is very difficult. Common methods for generating negatives include drawing random pairs of biomolecules from all known proteins found in a specific organism (Saha et al, 2014), or only from the selected subset of the whole proteome, namely from the proteins occurring in positive examples (Chang et al, 2010). We strongly believe that such methods have their inherent drawbacks, because they ignore network properties of the underlying protein interactome.…”
Section: Protein Level Positives and Negatives (mentioning)
confidence: 99%
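The two negative-sampling strategies named in the statement above can be illustrated with a minimal Python sketch. It is not the procedure of Saha et al. or Chang et al.; the function name `sample_negative_pairs`, its parameters, and the toy protein IDs are hypothetical, and the sketch enumerates every candidate pair, which is only practical for small protein pools.

```python
import random
from itertools import combinations


def sample_negative_pairs(positives, n_negatives, proteome=None, seed=0):
    """Draw random protein pairs that are not known positives.

    positives   : iterable of (protein_a, protein_b) known interactions
    n_negatives : number of negative pairs to draw
    proteome    : optional iterable of all protein IDs in the organism;
                  if None, the pool is restricted to proteins that occur
                  in the positive pairs (the second strategy)
    """
    # Unordered representation so (A, B) and (B, A) count as the same pair.
    positive_set = {frozenset(pair) for pair in positives}

    if proteome is None:
        pool = sorted({prot for pair in positive_set for prot in pair})
    else:
        pool = sorted(set(proteome))

    # Every unordered pair from the pool that is not a known interaction.
    candidates = [
        pair for pair in combinations(pool, 2)
        if frozenset(pair) not in positive_set
    ]
    if n_negatives > len(candidates):
        raise ValueError("not enough non-interacting pairs in the pool")

    rng = random.Random(seed)  # fixed seed for reproducible sampling
    return rng.sample(candidates, n_negatives)


# Toy data (hypothetical IDs, not taken from any database).
toy_positives = [("A", "B"), ("B", "C")]
from_positives_only = sample_negative_pairs(toy_positives, 1)
from_whole_proteome = sample_negative_pairs(
    toy_positives, 3, proteome=["A", "B", "C", "D"]
)
```

Both variants sample uniformly at random, so neither takes the network properties of the interactome into account, which is the drawback the citing authors raise.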
“…State of the art computational methods for the prediction of PPI combine information from different sources and have presented adequate classification performance. Recent approaches (Zhang et al, 2012; Saha et al, 2014; Theofilatos et al, 2014) have attempted to overcome the bottlenecks in this PPIs prediction, namely the definition of negative datasets, the feature selection, the class imbalance, the tradeoff between classification performance and interpretability, missing features values and the calculation of a confidence score for every PPI. The advancements on the computational prediction and scoring of PPI enabled the construction of binary PPI networks with increased coverage on the full interactome.…”
Section: State-of-the-art and Recent Advancements Of The Computationa… (mentioning)
confidence: 99%