Enhancing protein inter-residue real distance prediction by scrutinising deep learning models

Rahman, Julia; Newton, M. A. Hakim; Islam, Khaled Ben; Sattar, Abdul

doi:10.1038/s41598-021-04441-y

Cited by 8 publications

(7 citation statements)

References 62 publications


“…The seven physicochemical properties [14, 29] for each amino acid residue are steric parameter (graph shape index), hydrophobicity, volume, polarisability, isoelectric point, helix probability, and sheet probability. When extracting these three features for protein residues, we focused exclusively on the 20 standard amino acid residues.…”

Section: Methods (mentioning)

confidence: 99%

“…Inspired by the use of distance measures in protein structure prediction [14, 28, 29], in this work, we employ distance-based input features in protein-ligand binding affinity prediction. To be more specific, we use distances between donor-acceptor [30], hydrophobic [31, 32], and π-stacking [31, 32] atoms as interactions between such atoms play crucial roles in protein-ligand binding.…”

Section: Introduction (mentioning)

confidence: 99%


Distance plus attention for binding affinity prediction

Rahman, Newton, Ali et al. 2024, J Cheminform

Self Cite


Protein-ligand binding affinity plays a pivotal role in drug development, particularly in identifying potential ligands for target disease-related proteins. Accurate affinity predictions can significantly reduce both the time and cost involved in drug development. However, highly precise affinity prediction remains a research challenge. A key to improving affinity prediction is to capture interactions between proteins and ligands effectively. Existing deep-learning-based computational approaches use 3D grids, 4D tensors, molecular graphs, or proximity-based adjacency matrices, which are either resource-intensive or do not directly represent potential interactions. In this paper, we propose atomic-level distance features and attention mechanisms to better capture specific protein-ligand interactions based on donor-acceptor relations, hydrophobicity, and π-stacking atoms. We argue that distances encompass both short-range direct and long-range indirect interaction effects, while attention mechanisms capture levels of interaction effects. On the well-known CASF-2016 dataset, our proposed method, named Distance plus Attention for Affinity Prediction (DAAP), significantly outperforms existing methods by achieving Correlation Coefficient (R) 0.909, Root Mean Squared Error (RMSE) 0.987, Mean Absolute Error (MAE) 0.745, Standard Deviation (SD) 0.988, and Concordance Index (CI) 0.876. The proposed method also shows substantial improvement, around 2% to 37%, on five other benchmark datasets. The program and data are publicly available at https://gitlab.com/mahnewton/daap.

Scientific Contribution Statement: This study introduces distance-based features to predict protein-ligand binding affinity, capitalizing on unique molecular interactions. Furthermore, the incorporation of protein sequence features of specific residues enhances the model’s proficiency in capturing intricate binding patterns. The predictive capabilities are further strengthened through the use of a deep learning architecture with attention mechanisms, and an ensemble approach, averaging the outputs of five models, is implemented to ensure robust and reliable predictions.
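The abstract above reports R, RMSE, MAE, and CI, and mentions averaging the outputs of five models. As a rough illustration of what those quantities measure, here is a minimal sketch in plain Python. The function names are hypothetical and are not taken from the DAAP repository, the tie handling in the concordance index is one common convention and is an assumption here, and the SD metric is left out because its exact definition is not given in the abstract.

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error between measured and predicted affinities.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_r(y_true, y_pred):
    # Pearson correlation coefficient (R).
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def concordance_index(y_true, y_pred):
    # Fraction of pairs with different true values whose predictions are
    # ordered the same way; tied predictions count as 0.5 (assumed convention).
    concordant, comparable = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true values are not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    return concordant / comparable

def ensemble_mean(per_model_preds):
    # Average predictions position-wise across models (one inner list per model).
    return [sum(column) / len(column) for column in zip(*per_model_preds)]
```

Calling `ensemble_mean` on five lists of per-complex predictions and then passing the result to the metric functions mirrors the evaluation flow the abstract describes, under the assumptions stated above.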


“…To balance between 3D structural information and simplicity, 2D representation via an attributed graph can be used. For example, in the case of protein, the distance/contact between residues can be predicted [42, 43] to form the contact/distance map. The contact/distance map is then used as an adjacency matrix of an attributed graph where each node represents a residue and edges represent the contact/distance between residues.…”

Section: Learning Representations (mentioning)

confidence: 99%
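The idea in the statement above, using a contact/distance map as the adjacency matrix of a residue graph, can be sketched in a few lines of plain Python. This is only an illustration: the function names, the toy coordinates, and the 8 Å contact threshold are assumptions and do not come from the cited papers.

```python
import math

def distance_map(coords):
    # Pairwise Euclidean distances between residue coordinates
    # (for example one representative atom per residue).
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def contact_adjacency(dmap, threshold=8.0):
    # Binary adjacency matrix: residues i and j are joined by an edge when
    # their distance is at most `threshold` (assumed cutoff, in angstroms).
    n = len(dmap)
    return [
        [1 if i != j and dmap[i][j] <= threshold else 0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical coordinates for four residues placed along a line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
dmap = distance_map(coords)
adjacency = contact_adjacency(dmap)
```

In this toy case the first three residues are mutually in contact and the fourth is isolated. A graph model could also keep the real-valued entries of `dmap` as edge attributes instead of thresholding them.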

Learning to discover medicines

Nguyen, Tran 2022, Int J Data Sci Anal


Discovering new medicines is the hallmark of the human endeavor to live a better and longer life. Yet the pace of discovery has slowed as we need to venture into ever more unexplored biomedical space to find medicines that meet today’s high standards. Modern AI, enabled by powerful computing, large biomedical databases, and breakthroughs in deep learning, offers new hope of breaking this loop, as AI is rapidly maturing and ready to make a huge impact in the area. In this paper, we review recent advances in AI methodologies that aim to crack this challenge. We organize the vast and rapidly growing literature on AI for drug discovery into three relatively stable sub-areas: (a) representation learning over molecular sequences and geometric graphs; (b) data-driven reasoning where we predict molecular properties and their binding, optimize existing compounds, generate de novo molecules, and plan the synthesis of target molecules; and (c) knowledge-based reasoning where we discuss the construction of, and reasoning over, biomedical knowledge graphs. We also identify open challenges and chart possible research directions for the years to come.


“…The widespread application of distance map prediction has attracted extensive attention from researchers. Barger et al. [26] and Rahman et al. [27] develop extended [footnote 1: Helices can be identified by thickening of the diagonal line on the distance map, while parallel and antiparallel β-folds can be characterized by lines parallel or orthogonal to the diagonal line of the distance map, respectively.] [footnote 2: Two or more secondary structural units are connected by a connecting polypeptide (loop) to form further a local spatial structure with a special geometric arrangement.]…”

Section: Introduction (mentioning)

confidence: 99%

Freeprotmap: waiting-free prediction method for protein distance map

Huang, Li, Chen et al. 2024, BMC Bioinformatics


Background: Protein residue–residue distance maps are used for remote homology detection, protein information estimation, and protein structure research. However, existing prediction approaches are time-consuming, and hundreds of millions of proteins are discovered each year, necessitating the development of a rapid and reliable prediction method for protein residue–residue distances. Moreover, because many proteins lack known homologous sequences, a waiting-free and alignment-free deep learning method is needed.

Result: In this study, we propose a learning framework named FreeProtMap. In terms of protein representation processing, the proposed group pooling in FreeProtMap effectively mitigates issues arising from high-dimensional sparseness in protein representation. In terms of model structure, we have made several careful design choices. Firstly, the model is designed around the locality of protein structures and triangular inequality distance constraints to improve prediction accuracy. Secondly, inference speed is improved by using additive attention and a lightweight design. In addition, generalization ability is improved by using bottlenecks and a neural network block named local microformer. As a result, FreeProtMap can predict protein residue–residue distances in tens of milliseconds and has higher precision than the best structure prediction method.

Conclusion: Several groups of comparative experiments and ablation experiments verify the effectiveness of the designs. The results demonstrate that FreeProtMap significantly outperforms other state-of-the-art methods in accurate protein residue–residue distance prediction, which is beneficial for much protein research. It is worth mentioning that FreeProtMap could be used to scan all proteins discovered each year to find structurally similar proteins in a short time, because structure similarity calculation based on distance maps is much less time-consuming than algorithms based on 3D structures.
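The abstract above names triangular inequality distance constraints as a design consideration but does not say how they enter the model. As a loose illustration of the constraint itself, and not of the FreeProtMap architecture, the following sketch checks a distance map for triples of residues that violate d(i, j) ≤ d(i, k) + d(k, j). The function name and the tolerance value are hypothetical.

```python
def triangle_violations(dmap, tol=1e-9):
    # Return (i, j, k) triples where the direct distance d(i, j) exceeds the
    # detour d(i, k) + d(k, j) by more than `tol`. A matrix of distances that
    # comes from real 3D coordinates never produces such a triple.
    n = len(dmap)
    violations = []
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(n):
                if k == i or k == j:
                    continue
                if dmap[i][j] > dmap[i][k] + dmap[k][j] + tol:
                    violations.append((i, j, k))
    return violations

# A geometrically consistent map (a 3-4-5 triangle) and an inconsistent one.
consistent = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
inconsistent = [[0, 1, 10], [1, 0, 1], [10, 1, 0]]
```

The check is cubic in the number of residues, so it is only practical as a diagnostic on small maps. How a predictor would enforce the constraint during training is a separate question that this sketch does not address.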

