Geometric Deep Learning on Molecular Representations

Atz, Kenneth; Grisoni, Francesca; Schneider, Gisbert

doi:10.48550/arxiv.2107.12375

Cited by 7 publications

(7 citation statements)

References 157 publications

(214 reference statements)


“…Molecular properties are roto-translation invariant (Atz et al, 2021). However, some molecules are chiral and their chiral properties are dependent to the absolute configuration of their stereogenic centers, and thus non-invariant to reflection, as we claim in Proposition 1.…”

Section: A3 Encoding Chiral Molecules With Absolute Positions (mentioning)

confidence: 66%
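The citation statement above contrasts roto-translation invariance with sensitivity to reflection. As a minimal sketch of that distinction (not code from either paper), the example below uses a made-up four-atom geometry: the sorted interatomic distances stay the same under rotation, translation, and reflection, while the signed volume spanned by three substituents keeps its sign under rotation and translation but flips under reflection. All function names and coordinates are illustrative assumptions.

```python
import math

def pairwise_distances(coords):
    """Sorted interatomic distances: unchanged by rotation, translation, reflection."""
    n = len(coords)
    return sorted(
        math.dist(coords[i], coords[j])
        for i in range(n)
        for j in range(i + 1, n)
    )

def signed_volume(center, a, b, c):
    """Scalar triple product of three substituent vectors around a center atom."""
    u = [a[k] - center[k] for k in range(3)]
    v = [b[k] - center[k] for k in range(3)]
    w = [c[k] - center[k] for k in range(3)]
    return (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        - u[1] * (v[0] * w[2] - v[2] * w[0])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )

def rotate_z(p, angle):
    """Rotate a point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1], p[2])

def translate(p, t):
    """Shift a point by the vector t."""
    return (p[0] + t[0], p[1] + t[1], p[2] + t[2])

def reflect_x(p):
    """Mirror a point through the yz-plane."""
    return (-p[0], p[1], p[2])

# Hypothetical stereocenter: a center atom followed by three substituents.
coords = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.2, 0.0),
    (0.0, 0.0, 1.5),
]
moved = [translate(rotate_z(p, 0.7), (2.0, -1.0, 0.5)) for p in coords]
mirrored = [reflect_x(p) for p in coords]

print(signed_volume(*coords), signed_volume(*moved), signed_volume(*mirrored))
```

A model that only sees interatomic distances therefore cannot tell the two mirror images apart, which is why a chirality-aware descriptor needs some reflection-sensitive quantity of this kind.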

Molformer: Motif-based Transformer on 3D Heterogeneous Molecular Graphs

Wu, Zhang, Radev et al. 2021

Preprint


Spatial structures in the 3D space are important to determine molecular properties. Recent papers use geometric deep learning to represent molecules and predict properties. These papers, however, are computationally expensive in capturing long-range dependencies of input atoms; and more importantly, they have not considered the non-uniformity of interatomic distances, thus failing to learn context-dependent representations at different scales. To deal with such issues, we introduce 3D-Transformer, a variant of the Transformer for molecular representations that incorporates 3D spatial information. 3D-Transformer operates on a fully-connected graph with direct connections between atoms. To cope with the non-uniformity of interatomic distances, we develop a multi-scale self-attention module that exploits local fine-grained patterns with increasing contextual scales. As molecules of different sizes rely on different kinds of spatial features, we design an adaptive position encoding module that adopts different position encoding methods for small and large molecules. Finally, to attain the molecular representation from atom embeddings, we propose an attentive farthest point sampling algorithm that selects a portion of atoms with the assistance of attention scores, overcoming handicaps of the virtual node and previous distance-dominant downsampling methods. We validate 3D-Transformer across three important scientific domains: quantum chemistry, material science, and proteomics. Our experiments show significant improvements over state-of-the-art models on the crystal property prediction task and the protein-ligand binding affinity prediction task, and show better or competitive performance in quantum chemistry molecular datasets. This work provides clear evidence that biochemical tasks can gain consistent benefits from 3D molecular representations and different tasks require different position encoding methods.
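The abstract names an attentive farthest point sampling step but does not give its formula. The sketch below is therefore only an illustration: it implements standard greedy farthest point sampling and adds an optional per-atom score term. The additive weighting `lam`, the `scores` argument, and the function name are assumptions made for this example, not the authors' method.

```python
import math

def farthest_point_sampling(coords, k, scores=None, lam=0.0, start=0):
    """Greedily pick k point indices that are spread out in space.

    With lam == 0 this is plain farthest point sampling. With lam > 0,
    points with a high score (for example an attention weight) are
    favoured; this additive rule is an illustrative assumption.
    """
    n = len(coords)
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= number of points")
    if scores is None:
        scores = [0.0] * n

    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, coords[start]) for p in coords]

    while len(selected) < k:
        best = max(
            (i for i in range(n) if i not in selected),
            key=lambda i: min_dist[i] + lam * scores[i],
        )
        selected.append(best)
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(coords[i], coords[best]))
    return selected

# Toy atoms on a line; the last one is far from the rest.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]

print(farthest_point_sampling(atoms, 3))
print(farthest_point_sampling(atoms, 2, scores=[0.0, 5.0, 0.0, 0.0], lam=10.0))
```

In the first call only distances matter, so the remote atom is chosen right after the starting atom. In the second call the large score on atom 1 outweighs its small distance, which is the kind of behaviour a purely distance-dominant downsampling scheme cannot produce.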


“…In the field of deep learning, geometry-based methods have shown prominent performance (Bronstein et al 2017;Zhou et al 2020;Li et al 2020). Since molecules have geometric structures intrinsically, a few attempts have also been made to develop geometric graph learning models for the molecular graphs (Atz, Grisoni, and Schneider 2021). From the 2D view of the molecular graph, Recent works (Maziarka et al 2020;Ying et al 2021) are designed to encode interatomic distances with augmenting the attention mechanism in a transformer architecture.…”

Section: Geometric Learning On Molecular Graphs (mentioning)

confidence: 99%

GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction

Zhou

et al. 2022

AAAI


Recently many efforts have been devoted to applying graph neural networks (GNNs) to molecular property prediction which is a fundamental task for computational drug and material discovery. One of major obstacles to hinder the successful prediction of molecular property by GNNs is the scarcity of labeled data. Though graph contrastive learning (GCL) methods have achieved extraordinary performance with insufficient labeled data, most focused on designing data augmentation schemes for general graphs. However, the fundamental property of a molecule could be altered with the augmentation method (like random perturbation) on molecular graphs. Whereas, the critical geometric information of molecules remains rarely explored under the current GNN and GCL architectures. To this end, we propose a novel graph contrastive learning method utilizing the geometry of the molecule across 2D and 3D views, which is named GeomGCL. Specifically, we first devise a dual-view geometric message passing network (GeomMPNN) to adaptively leverage the rich information of both 2D and 3D graphs of a molecule. The incorporation of geometric properties at different levels can greatly facilitate the molecular representation learning. Then a novel geometric graph contrastive scheme is designed to make both geometric views collaboratively supervise each other to improve the generalization ability of GeomMPNN. We evaluate GeomGCL on various downstream property prediction tasks via a finetune process. Experimental results on seven real-life molecular datasets demonstrate the effectiveness of our proposed GeomGCL against state-of-the-art baselines.
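The abstract says the 2D and 3D views supervise each other through a contrastive scheme, without stating the loss. As a rough sketch of how such cross-view supervision is commonly set up, the example below computes a generic InfoNCE-style loss in which the 2D and 3D embeddings of the same molecule form the positive pair and other molecules in the batch act as negatives. The function names, the temperature value, and the toy embeddings are assumptions for illustration; this is not claimed to be the GeomGCL objective.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_2d, view_3d, temperature=0.5):
    """Mean InfoNCE loss, treating (view_2d[i], view_3d[i]) as positive pairs."""
    n = len(view_2d)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_2d[i], view_3d[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n

# Toy batch of two molecules with hypothetical 2-dimensional embeddings.
emb_2d = [(1.0, 0.0), (0.0, 1.0)]
emb_3d_aligned = [(1.0, 0.1), (0.1, 1.0)]
emb_3d_swapped = [(0.1, 1.0), (1.0, 0.1)]

print(info_nce(emb_2d, emb_3d_aligned))
print(info_nce(emb_2d, emb_3d_swapped))
```

The loss is small when each molecule's two views agree and large when they are mismatched, so minimizing it pulls the 2D and 3D representations of the same molecule together.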


“…Generative ML, the unsupervised learning from input data to generate new data that is similar to the provided data, allows to perform fully data-driven molecule generation and is an active research area (Elton et al, 2019;Faez et al, 2021;Gaudelet et al, 2021;Atz et al, 2021). Various works have developed string-based ML models in order to generate molecules with optimal properties based on SMILES (Kadurin et al, 2017;Gómez-Bombarelli et al, 2018;Mario Krenn et al, 2020;Blaschke et al, 2018;Lim et al, 2018;Bjerrum and Sattarov, 2018;Prykhodko et al, 2019;Griffiths and Hernández-Lobato, 2020), InChI (Winter et al, 2019a), or SELFIES (Mario Krenn et al, 2020), the latter being a more robust string representation of molecules.…”

Section: Generative Models (mentioning)

confidence: 99%

Graph Machine Learning for Design of High-Octane Fuels

Rittig, Ritzert, Schweidtmann et al. 2022

Preprint


Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.
