2022
DOI: 10.26599/tst.2021.9010064

Protein Residue Contact Prediction Based on Deep Learning and Massive Statistical Features from Multi-Sequence Alignment

Abstract: Sequence-based protein tertiary structure prediction is of fundamental importance because the function of a protein ultimately depends on its 3D structure. An accurate residue-residue contact map is one of the essential elements for current ab initio prediction protocols of 3D structure prediction. Recently, with the combination of deep learning and direct coupling techniques, the performance of residue contact prediction has achieved significant progress. However, a considerable number of current Deep-Learnin…

Cited by 9 publications (6 citation statements)
References 50 publications
“…The first feature set contains 526 feature channels: one-hot-encoder of the target sequence (1D features, 20*2 channels); position-specific frequency matrix (1D features, 21*2 channels, considering gap) and positional entropy (Yang et al, 2020) (1D features, 1*2 channels); and coupling features (Yang et al, 2020) (2D features, 441 channels) derived from the inverse of the shrunk covariance matrix of MSA. The second feature set contains 151 feature channels: one-hot-encoder of the target sequence (1D features, 20*2 channels), position-specific scoring matrix (Altschul et al, 1997) (1D features; 20*2 channels; not considering gap), HMM profile (Remmert et al, 2012) (1D features, 30*2 channels), secondary structure from SPOT-1D (Hanson et al, 2019) (1D features, 3*2 channels), solvent accessible surface area from SPOT-1D (Hanson et al, 2019) (1D features, 1*2 channels), CCMPRED score (Seemayer et al, 2014) (2D features, 1 channel), mutual information (Zhang et al, 2022) (2D feature, 1 channel), and statistical pair-wise contact potential (Betancourt and Thirumalai, 1999) (2D feature, 1 channel). The first feature set, indicated as FeatSet1, is mainly composed of 2D direct coupling features (441 out of 526 total features) from the MSA, while the second feature set, indicated as FeatSet2, is mainly composed of 1D sequence-based features (148 out of 151 total features).…”
Section: Methods (mentioning)
confidence: 99%
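
The "C*2 channels" notation in the statement above suggests that each 1D feature with C channels is expanded to a pairwise (L x L) map by concatenating the feature vectors of residues i and j. The following is a minimal sketch of that common construction using NumPy, not the cited authors' implementation; the function names, the random placeholder inputs, and the sequence length L = 8 are assumptions made for illustration. It reproduces the FeatSet2 channel count: 2 * (20 + 20 + 30 + 3 + 1) = 148 tiled 1D channels plus 3 native 2D channels gives 151. (By the same arithmetic, the FeatSet1 items listed in the excerpt add up to 40 + 42 + 2 + 441 = 525, one fewer than the stated 526; the excerpt does not say where the remaining channel comes from.)

```python
import numpy as np


def tile_1d_to_2d(feat_1d):
    """Expand per-residue features (L, C) into pairwise features (L, L, 2*C).

    Entry [i, j] holds the features of residue i followed by those of residue j.
    """
    L, C = feat_1d.shape
    row = np.broadcast_to(feat_1d[:, None, :], (L, L, C))  # residue i
    col = np.broadcast_to(feat_1d[None, :, :], (L, L, C))  # residue j
    return np.concatenate([row, col], axis=-1)


def build_pair_tensor(feats_1d, feats_2d):
    """Stack tiled 1D features and native 2D features along the channel axis."""
    tiled = [tile_1d_to_2d(f) for f in feats_1d]
    return np.concatenate(tiled + list(feats_2d), axis=-1)


# Channel counts taken from the FeatSet2 description; the values themselves
# are random placeholders, not real PSSM/HMM/CCMpred outputs.
L = 8  # hypothetical sequence length
rng = np.random.default_rng(0)
featset2_1d = {"one_hot": 20, "pssm": 20, "hmm": 30, "sec_struct": 3, "sasa": 1}
featset2_2d = {"ccmpred": 1, "mutual_info": 1, "contact_potential": 1}

feats_1d = [rng.random((L, c)) for c in featset2_1d.values()]
feats_2d = [rng.random((L, L, c)) for c in featset2_2d.values()]

x = build_pair_tensor(feats_1d, feats_2d)
print(x.shape)  # (8, 8, 151)
```

A tensor of this shape is what a 2D convolutional network would typically take as input for contact-map prediction, with one output probability per residue pair.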
“…And the robustness of the proposed algorithm to different disturbances acting on the ship is proved by simulation studies, and the obtained performance is comparable to the state-of-the-art methods based on template matching [1]. Another team of scholars has developed a protein residue contact prediction system based on deep learning and massive statistical features of multiple sequence alignments [2]. Ojugo created a predictive and intelligent decision support model for the diabetes pandemic using deep reinforcement learning algorithms [3].…”
Section: Related Work (mentioning)
confidence: 99%
“…From the birth of machine learning to the present, according to the hierarchical structure of the model, its development process has gone through two stages: shallow learning and deep learning. In general, these models are considered nonlinear, only nonlinear transformers [12,13]. Deep learning is a model of a deep neural network with many layers of mystery.…”
Section: Deep Learning (mentioning)
confidence: 99%