2020
DOI: 10.48550/arxiv.2010.16027
Preprint

PersGNN: Applying Topological Data Analysis and Geometric Deep Learning to Structure-Based Protein Function Prediction

Abstract: Understanding protein structure-function relationships is a key challenge in computational biology, with applications across the biotechnology and pharmaceutical industries. While it is known that protein structure directly impacts protein function, many functional prediction tasks use only protein sequence. In this work, we isolate protein structure to make functional annotations for proteins in the Protein Data Bank in order to study the expressiveness of different structure-based prediction schemes. We pres…

Citations: cited by 4 publications (8 citation statements)
References: 15 publications
“…3a). In addition to graphs, we calculate 1D and 2D persistence diagrams of Cα atoms in each protein, the coarsened topological features measured by persistent homology [20,28], and embed the diagrams into persistence images via Gaussian kernels and pixelated integrals [29] (see Methods). Our model takes input from three feature pipelines: (1) spatial information from the contact map where residue pairs are closer than cutoff distance r_c such as 8 Å and 12 Å, (2) topological information from the persistence image, which is preprocessed as vectors of 625 dimensions, and (3) dynamic information based on the correlation map such that residue pairs with absolute correlation no less than 0.5 are connected with correlation edges if they are not in contact.…”
Section: Dynamics-informed Representation Increases the Discriminator... (mentioning)
confidence: 99%
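
The topological pipeline quoted above (persistence diagram to persistence image to 625-dimensional vector) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption rather than the citing authors' method: the function name `persistence_image`, the 25 × 25 grid (chosen only because 25 × 25 = 625), the kernel width `sigma`, the unit `extent`, and the linear persistence weighting are all hypothetical. Each diagram point is mapped to (birth, persistence) coordinates, a Gaussian is centered on it, and the Gaussian is integrated exactly over each pixel.

```python
import math
import numpy as np

# Element-wise error function without requiring SciPy.
_erf = np.vectorize(math.erf)


def gaussian_cdf(x, mu, sigma):
    """CDF of a 1D normal distribution N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + _erf((x - mu) / (sigma * math.sqrt(2.0))))


def persistence_image(diagram, resolution=25, sigma=0.1, extent=(0.0, 1.0)):
    """Flattened persistence image of a diagram of (birth, death) pairs.

    Assumptions (not taken from the source): a square grid over `extent`
    on both axes, an isotropic Gaussian kernel, and a weight equal to the
    persistence of each point.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    births = diagram[:, 0]
    persistences = diagram[:, 1] - diagram[:, 0]

    pixel_edges = np.linspace(extent[0], extent[1], resolution + 1)
    image = np.zeros((resolution, resolution))

    for b, p in zip(births, persistences):
        # An isotropic Gaussian is separable, so its integral over a pixel
        # is the product of two 1D CDF differences.
        mass_birth = np.diff(gaussian_cdf(pixel_edges, b, sigma))
        mass_pers = np.diff(gaussian_cdf(pixel_edges, p, sigma))
        # Rows index persistence, columns index birth.
        image += p * np.outer(mass_pers, mass_birth)

    return image.ravel()


# Toy diagram with two features; real diagrams would come from a
# persistent homology library applied to Cα coordinates.
vec = persistence_image([(0.2, 0.5), (0.4, 0.9)], sigma=0.05)
print(vec.shape)
```

Integrating over pixels, rather than sampling the kernel at pixel centers, keeps the total image mass close to the summed weights as long as the points lie well inside the grid.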
“…We also construct two datasets with different cutoff distances r_c = 8 Å and r_c = 12 Å to observe the effect of the cutoff on the classification performance. The choices of cutoff distance for graph-based protein feature extraction vary from approximately 6 to 15 Å in the literature [17,20,32]. Here we follow the cutoff 8 Å used by PersGNN [20] for a fair comparison.…”
Section: Dynamics-informed Representation Increases the Discriminator... (mentioning)
confidence: 99%
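
To make the effect of the cutoff concrete, the following sketch builds contact edges at r_c = 8 Å and r_c = 12 Å and then adds correlation edges under the rule quoted in the citation statements (absolute correlation no less than 0.5, pair not in contact). The helper names, the toy straight chain with 3.8 Å spacing, and the hand-made correlation matrix are assumptions for illustration only; they are not data or code from the cited work.

```python
import numpy as np


def contact_mask(coords, cutoff):
    """Boolean matrix marking residue pairs closer than `cutoff` (in Å)."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    mask = dist < cutoff
    np.fill_diagonal(mask, False)  # no self-loops
    return mask


def correlation_mask(corr, contacts, threshold=0.5):
    """Pairs with |correlation| >= threshold that are not already in contact."""
    mask = (np.abs(np.asarray(corr, dtype=float)) >= threshold) & ~contacts
    np.fill_diagonal(mask, False)
    return mask


def edge_list(mask):
    """Undirected edges (i < j) from a symmetric boolean matrix."""
    i, j = np.nonzero(np.triu(mask, k=1))
    return np.stack([i, j], axis=1)


# Toy input: 10 pseudo-residues on a straight line, 3.8 Å apart.
coords = np.column_stack([3.8 * np.arange(10), np.zeros(10), np.zeros(10)])

# Hypothetical correlation map with two strongly correlated pairs.
corr = np.eye(10)
corr[0, 3] = corr[3, 0] = -0.7
corr[0, 9] = corr[9, 0] = 0.5

counts = {}
for cutoff in (8.0, 12.0):
    contacts = contact_mask(coords, cutoff)
    correlated = correlation_mask(corr, contacts)
    counts[cutoff] = (len(edge_list(contacts)), len(edge_list(correlated)))
    print(cutoff, counts[cutoff])
```

On this toy chain the larger cutoff adds contact edges and absorbs the pair (0, 3), which is 11.4 Å apart, into the contact set, so that pair no longer receives a correlation edge. The two edge types are therefore coupled through the choice of r_c.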