2017
DOI: 10.1371/journal.pone.0174386

An information-based network approach for protein classification

Abstract: Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and polygenetic-tree to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, protein universe is modeled as an undirected network, where proteins are classified according to their co…

Citations: cited by 6 publications (5 citation statements)
References: 32 publications
“…Similar to CR, we can get a K×K nMIR matrix for each of the structural classes: where is the nMIR value between X i and X j ( i , j = 1,2,…, K ) [ 36 ], is the maximum entropy for all X i (i = 1,2,…,K), is the mutual information rate between X i and X j , and when i = j it degenerates to the Shannon Entropy of X i [ 50 ]: …”
Section: Methods (mentioning)
confidence: 99%
“…The nMIR is a model-free measure that evaluates mutual relations no matter linear or not. Higher nMIR values may indicate stronger symmetric relations between the feature series [ 50 ].…”
Section: Methods (mentioning)
confidence: 99%
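
The statements above note that the mutual information between a series and itself degenerates to its Shannon entropy. The following is a minimal sketch of that identity for discrete symbol series. It computes plain mutual information from empirical frequencies, not the mutual information rate or the normalized nMIR of the cited work, and the function names and toy series are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(xs):
    """Empirical Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two aligned sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Toy series (illustrative only, not data from the paper).
x = [0, 1, 0, 1, 1, 0, 0, 1]
y = [0, 1, 0, 1, 0, 1, 0, 1]

print(shannon_entropy(x))        # entropy of x
print(mutual_information(x, x))  # equals the entropy of x
print(mutual_information(x, y))  # symmetric in its two arguments
```

Because the measure is symmetric, a matrix of pairwise values built this way is symmetric as well, with the entropies on its diagonal.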
“…Sequence space can be modelled through mutual information (Wan, Zhao & Yau, 2017 ). These values were used to construct a matrix of similarities between sequences, which were then converted to adjacency in a network.…”
Section: Network (mentioning)
confidence: 99%
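
The statement above describes converting a matrix of pairwise similarities into adjacency in a network. One way this could look is sketched below. The fixed threshold rule is an assumption made here for illustration, since the statement does not say how similarities were converted, and the function names and matrix values are hypothetical.

```python
def similarity_to_adjacency(sim, threshold):
    """Build an undirected 0/1 adjacency matrix from a symmetric similarity matrix.

    Assumption: two nodes are linked when their similarity is at least
    `threshold`. Self-loops are not created.
    """
    n = len(sim)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


def connected_components(adj):
    """Return the connected components of the network as sorted index lists."""
    n = len(adj)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in range(n):
                if adj[node][neighbour] and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Toy similarity matrix for three sequences (illustrative values only).
sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
adj = similarity_to_adjacency(sim, threshold=0.5)
print(adj)
print(connected_components(adj))
```

Reading the connected components off the adjacency matrix gives one simple grouping of the sequences; other community-detection choices are possible.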
“…This data set contains 50 beta-globin protein sequences from 50 species studied in [46, 49–53], and the accession numbers have been shown in Additional file 1: Notes 1.2. After extracting features by the method DCGR and reducing the dimensionality using PCA, the Cosine distance was used to calculate the distance matrix of 50 beta-globin protein sequences, and the phylogenetic tree was also constructed in Fig.…”
Section: Similarity Analysis Of 50 Beta-globin Protein Sequences (mentioning)
confidence: 99%
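
The statement above computes a cosine distance matrix over dimension-reduced feature vectors. A minimal sketch of that step follows. It assumes the feature vectors have already been extracted and reduced, so neither DCGR nor PCA is implemented here, and the function names and vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(angle), between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Symmetric matrix of pairwise cosine distances with a zero diagonal."""
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = cosine_distance(vectors[i], vectors[j])
    return dist


# Toy reduced feature vectors (illustrative only, not real protein features).
features = [
    [1.0, 0.0],
    [2.0, 0.0],
    [0.0, 3.0],
]
for row in distance_matrix(features):
    print(row)
```

A matrix like this is the usual input to a distance-based tree-building method; which method the citing paper used is not stated in the excerpt.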
“…The phylogenetic trees of other methods [46, 49–53] including ClustalW have also been shown in Additional file 1: Figures S9-S15. After comparison, we found that ClustalW achieves very similar results with our method DCGR, while the other methods performs much worse since even the mammals and non-mammals cannot be correctly separated by the methods in [46, 49–53], and lots of proteins are erroneously clustered by the methods in [46, 51–53].…”
Section: Similarity Analysis Of 50 Beta-globin Protein Sequences (mentioning)
confidence: 99%