2007
DOI: 10.2174/138161207780765981
An Introduction to Recursive Neural Networks and Kernel Methods for Cheminformatics

Abstract: The aim of this paper is to introduce the reader to new developments in Neural Networks and Kernel Machines concerning the treatment of structured domains. Specifically, we discuss the research on these relatively new models to introduce a novel and more general approach to QSPR/QSAR analysis. The focus is on the computational side and not on the experimental one.

Cited by 14 publications (10 citation statements) | References 61 publications
“…An essential element of the proposed method is thus a graph-based representation of our object of interest, namely a protein. With their long and successful history both in the field of coarse-graining ( Gfeller and Rios, 2007 ; Webb et al, 2019 ; Li et al, 2020 ) and in the prediction of protein properties ( Borgwardt et al, 2005 ; Ralaivola et al, 2005 ; Micheli et al, 2007 ; Fout et al, 2017 ; Gilmer et al, 2017 ; Torng and Altman, 2019 ), graph-based learning models represent a rather natural and common choice to encode the (static) features of a molecular structure; here, we show that a graph-based machine learning approach can reproduce the results of the mapping entropy estimate obtained by means of a much more time-consuming algorithmic workflow. To this end, we rely on Deep Graph Networks (DGNs) ( Bacciu et al, 2020 ), a family of machine learning models that learn from graph-structured data, where the graph has a variable size and topology; by training the model on a set of tuples (protein, CG mapping, and S map ), we can infer the S map values of unseen mappings associated with the same protein, making use of a tiny fraction of the extensive amount of information employed in the original method, i.e., the molecular structure viewed as a graph.…”
Section: Introduction
confidence: 99%
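The DGN idea quoted above — learning from graphs of variable size and topology by combining each node's features with those of its neighbours — can be illustrated with a minimal sketch. This is not the cited authors' implementation; the graph, features, and weight shapes below are hypothetical, chosen only to show one message-passing step followed by a permutation-invariant readout.

```python
import numpy as np

# Toy molecular graph: 4 atoms, bonds encoded in a symmetric
# adjacency matrix (hypothetical example, not from the paper).
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
], dtype=float)
x = np.eye(4)  # placeholder one-hot node features

rng = np.random.default_rng(0)
W_self = rng.normal(scale=0.1, size=(4, 8))    # transform of a node's own features
W_neigh = rng.normal(scale=0.1, size=(4, 8))   # transform of the neighbourhood sum

def dgn_layer(x, adj, W_self, W_neigh):
    """One graph-convolution step: each node's new state mixes its
    own features with the sum of its neighbours' features."""
    return np.tanh(x @ W_self + adj @ x @ W_neigh)

h = dgn_layer(x, adj, W_self, W_neigh)
graph_embedding = h.sum(axis=0)  # sum readout: invariant to node ordering
print(graph_embedding.shape)     # (8,)
```

Because the layer only uses the adjacency matrix, the same weights apply to graphs of any size or topology, which is the property the quoted passage relies on when training across different proteins and mappings.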
“…For example, parse trees arise in natural language processing tasks, where a parse tree or a related semantic tree structure is generated from a sentence [1], [2]; moreover, tree-like representations can be naturally derived from documents (e.g., [3]), HTML/XML documents in information retrieval [4], [5], [6], structured network data in computer security [7], molecule structures in computational chemistry [8], [9], and image analysis. In all these application domains, learning plays a crucial role, since very often the user is interested in automatic classification/regression tasks where a classifier/regressor is built from a set of labeled instances.…”
Section: Introduction
confidence: 99%
“…The algorithms we describe are based on recursive neural networks, and they deal with molecules directly as graphs: no features are manually extracted from the structure, and the networks automatically identify regions and substructures of the molecules that are relevant for the property in question. The basic structural processing cell we use is similar to those described in [15,16,17,18], and adopted in essentially the same form in applications including molecule regression/classification [19,20,21], image classification [22], natural language processing [23], and face recognition [24]. In the case of molecules, these earlier models have several disadvantages: they can only deal with trees, so molecules (which are more naturally described as Undirected Graphs (UG)) have to be preprocessed before being input; the preprocessing is generally task-dependent; special nodes ("super-sources") have to be defined for each molecule; and application domains are generally limited, so the effectiveness of the models is hard to gauge.…”
Section: Introduction
confidence: 99%
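The tree restriction described in the quoted passage comes from how these earlier recursive cells compute a state: each node's state is a function of its label and of the states of its children, computed bottom-up from the leaves to a designated super-source. A minimal sketch, assuming binary trees and made-up dimensions (none of this is the cited models' actual architecture):

```python
import numpy as np

DIM = 4  # hypothetical state/label dimensionality
rng = np.random.default_rng(1)
# Single shared weight matrix mapping [label; left state; right state]
# to the node's state, reused at every node of the tree.
W = rng.normal(scale=0.1, size=(3 * DIM, DIM))

def encode(tree, W):
    """Bottom-up recursive encoding.
    tree = (label_vector, left_subtree_or_None, right_subtree_or_None).
    Missing children contribute a zero state (the base case)."""
    label, left, right = tree
    h_left = encode(left, W) if left is not None else np.zeros(DIM)
    h_right = encode(right, W) if right is not None else np.zeros(DIM)
    return np.tanh(np.concatenate([label, h_left, h_right]) @ W)

leaf = lambda v: (v, None, None)
tree = (np.ones(DIM), leaf(np.zeros(DIM)), leaf(np.ones(DIM)))
root_state = encode(tree, W)  # the super-source state, fed to a regressor
print(root_state.shape)       # (4,)
```

The recursion terminates only because a tree has no cycles; on an undirected molecular graph the same scheme would loop forever, which is why the preprocessing into trees (and the super-source choice) that the passage criticizes was needed in the first place.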