A Novel Geometry-Based Approach to Infer Protein Interface Similarity

Budowski-Tal, Inbal; Kolodny, Rachel; Mandel-Gutfreund, Yael

doi:10.1038/s41598-018-26497-z

Cited by 8 publications

(11 citation statements)

References 68 publications


“…Such transformations impose a large overhead for computational methods. Several algorithms were reported to overcome the computational complexity arising from the spatial degrees of freedom (Nussinov and Wolfson 1991; Lin and Nussinov 1996; Brakoulias and Jackson 2004; Shulman-Peleg et al 2004; Morris et al 2005; Gold and Jackson 2006; Wallace et al 2008; Venkatraman et al 2009; Yin et al 2009; Kihara et al 2011; Zhu et al 2015; Budowski-Tal et al 2018; Daberdaku and Ferrari 2019). However, these methods rely on human-crafted descriptors and parameters based on heuristics, which may not be optimal in capturing the full complexity of molecular surfaces.…”

Section: Introduction (mentioning)

confidence: 99%

Surface ID: a geometry-aware system for protein molecular surface comparison

Riahi, Lee, Sorenson et al. 2023

Bioinformatics


Motivation: A protein can be represented in several forms, including its 1D sequence, 3D atom coordinates, and molecular surface. A protein surface contains rich structural and chemical features directly related to the protein’s function such as its ability to interact with other molecules. While many methods have been developed for comparing similarity of proteins using the sequence and structural representations, computational methods based on molecular surface representation are limited. Results: Here, we describe “Surface ID”, a geometric deep learning system for high-throughput surface comparison based on geometric and chemical features. Surface ID offers a novel grouping and alignment algorithm useful for clustering proteins by function, visualization, and in-silico screening of potential binding partners to a target molecule. Our method demonstrates top performance in surface similarity assessment, indicating great potential for protein functional annotation, a major need in protein engineering and therapeutic design. Availability: Source code for the Surface ID model, trained weights and inference script are available under an open-source (Apache Version 2.0) license at https://github.com/Sanofi-Public/LMR-SurfaceID. Supplementary information: Supplementary data are available at Bioinformatics online.


“…For instance, while the GLIDE score incorporates both global and local scores, it would be possible to directly supervise Topsy-Turvy with global and local loss terms, each with a respective hyper-parameter to finely control their effects. Loss terms that quantify protein functional similarity ( Ghersi and Singh, 2014 ) or interface similarity ( Budowski-Tal et al , 2018 ; Gainza et al , 2020 ) could be added to the framework to further inform predictions. Topsy-Turvy demonstrates that a general, scalable framework that allows us to transfer both low-level (sequence-to-structure) and high-level (network topology) insights across species can enable researchers to fill in the missing links in our knowledge of biological function.…”

Section: Discussion (mentioning)

confidence: 99%

Topsy-Turvy: integrating a global view into sequence-based PPI prediction

et al. 2022


Summary: Computational methods to predict protein–protein interaction (PPI) typically segregate into sequence-based ‘bottom-up’ methods that infer properties from the characteristics of the individual protein sequences, or global ‘top-down’ methods that infer properties from the pattern of already known PPIs in the species of interest. However, a way to incorporate top-down insights into sequence-based bottom-up PPI prediction methods has been elusive. We thus introduce Topsy-Turvy, a method that newly synthesizes both views in a sequence-based, multi-scale, deep-learning model for PPI prediction. While Topsy-Turvy makes predictions using only sequence data, during the training phase it takes a transfer-learning approach by incorporating patterns from both global and molecular-level views of protein interaction. In a cross-species context, we show it achieves state-of-the-art performance, offering the ability to perform genome-scale, interpretable PPI prediction for non-model organisms with no existing experimental PPI data. In species with available experimental PPI data, we further present a Topsy-Turvy hybrid (TT-Hybrid) model which integrates Topsy-Turvy with a purely network-based model for link prediction that provides information about species-specific network rewiring. TT-Hybrid makes accurate predictions for both well- and sparsely-characterized proteins, outperforming both its constituent components as well as other state-of-the-art PPI prediction methods. Furthermore, running Topsy-Turvy and TT-Hybrid screens is feasible for whole genomes, and thus these methods scale to settings where other methods (e.g. AlphaFold-Multimer) might be infeasible. The generalizability, accuracy and genome-level scalability of Topsy-Turvy and TT-Hybrid unlocks a more comprehensive map of protein interaction and organization in both model and non-model organisms. Availability and implementation: https://topsyturvy.csail.mit.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
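The citation statement above suggests supervising a model with separate global and local loss terms, each scaled by its own hyper-parameter, and optionally adding an interface-similarity term. A minimal sketch of such a weighted sum follows. The function name, argument names, and default weights are illustrative assumptions and are not taken from Topsy-Turvy or from the cited paper.

```python
def combined_loss(base_loss, global_loss, local_loss,
                  interface_loss=0.0,
                  lambda_global=0.1, lambda_local=0.1, lambda_interface=0.0):
    """Weighted sum of loss terms (hypothetical names and defaults).

    base_loss      -- the model's primary prediction loss
    global_loss    -- a network-level (top-down) term
    local_loss     -- a neighbourhood-level term
    interface_loss -- an optional interface-similarity term
    Each lambda_* hyper-parameter controls how strongly its term contributes.
    """
    return (base_loss
            + lambda_global * global_loss
            + lambda_local * local_loss
            + lambda_interface * interface_loss)


if __name__ == "__main__":
    # Toy numbers only, chosen to show the effect of the weights.
    print(combined_loss(1.0, 2.0, 4.0, lambda_global=0.5, lambda_local=0.25))
```

Setting a weight to zero switches its term off, which is why the interface term defaults to no contribution in this sketch.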


“…Several approaches have been proposed to reduce complex 3D information into compact signatures while preserving binding-related spatial features. For example, PatchBag characterized protein interface regions in terms of geometrical features from small surface units to search for evolutionary and functional relationships between proteins [6]. Deep Local Analysis evaluates the 3D conformational information with locally oriented cubes [45].…”

Section: Introduction (mentioning)

confidence: 99%

PIsToN: Evaluating Protein Binding Interfaces with Transformer Networks

Stebliankin, Shirali, Baral et al. 2023

Preprint


The computational studies of protein binding are widely used to investigate fundamental biological processes and facilitate the development of modern drugs, vaccines, and therapeutics. Scoring functions aim to predict complexes that would be formed by the binding of two biomolecules and to assess and rank the strength of the binding at the interface. Despite past efforts, the accurate prediction and scoring of protein binding interfaces remain a challenge. The physics-based methods are computationally intensive and often have to trade accuracy for computational cost. The possible limitations of current machine learning (ML) methods are ineffective data representation, network architectures, and limited training data. Here, we propose a novel approach called PIsToN (evaluating Protein binding Interfaces with Transformer Networks) that aims to distinguish native-like protein complexes from decoys. Each protein interface is transformed into a collection of 2D images (interface maps), where each image corresponds to a geometric or biochemical property in which pixel intensity represents the feature values. Such a data representation provides atomic-level resolution of relevant protein characteristics. To build hybrid machine learning models, additional empirical-based energy terms are computed and provided as inputs to the neural network. The model is trained on thousands of native and computationally-predicted protein complexes that contain challenging examples. The multi-attention transformer network is also endowed with explainability by highlighting the specific features and binding sites that were the most important for the classification decision. The developed PIsToN model significantly outperforms existing state-of-the-art scoring functions on well-known datasets.
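The citation statement for this entry describes PatchBag as characterizing interface regions by geometrical features of small surface units. One generic way to compare objects described by many small units is a bag-of-words histogram, sketched below. This is an illustration of that general idea only, not the published PatchBag algorithm: it assumes each surface patch has already been assigned an integer label from a hypothetical patch library, and it uses cosine similarity as an arbitrary choice of comparison.

```python
import math
from collections import Counter


def patch_histogram(patch_labels, library_size):
    """Count how often each library patch label occurs on one interface."""
    counts = Counter(patch_labels)
    return [counts.get(i, 0) for i in range(library_size)]


def cosine_similarity(u, v):
    """Cosine similarity of two count vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Made-up patch labels for two interfaces, library of 4 patch types.
    interface_a = patch_histogram([0, 2, 2], library_size=4)
    interface_b = patch_histogram([2, 2, 3], library_size=4)
    print(interface_a, interface_b, cosine_similarity(interface_a, interface_b))
```

Because the histogram ignores where each patch sits on the surface, comparing two interfaces needs no 3D superposition, which is the kind of compact signature the citation statement refers to.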
