Graph Wavelet Alignment Kernels for Drug Virtual Screening

Smalter, Aaron; Huan, Jun; Lushington, Gerald H.

doi:10.1142/9781848162648_0029

Cited by 6 publications

(8 citation statements)

References 11 publications


“…In the previously mentioned node extraction method, we ignore the neighborhood topology information of the chemical compound by focusing on atom physical and chemical properties. To add neighborhood topology information, we utilize a technique called the graph wavelet analysis, as originally presented in [21]. The output of the wavelet analysis is a vector of local feature averages, with the size of the vector controlled by a diffusion parameter d.…”

Section: Methods (mentioning)

confidence: 99%
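The statement above describes a per-atom vector of local feature averages whose length is set by the diffusion parameter d. A minimal sketch of that idea follows, assuming one scalar feature per atom and plain unweighted averages over the atoms at each hop distance 0..d. It is not the formulation of [21], and the names `hop_distances` and `local_feature_averages` are hypothetical.

```python
from collections import deque


def hop_distances(adj, source, max_hops):
    """Breadth-first search: {node: hop distance} for nodes within max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand past the diffusion radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def local_feature_averages(adj, features, node, d):
    """Return [average at hop 0, ..., average at hop d] around one node.

    Assumption: unweighted averages; hop rings with no atoms contribute 0.0.
    """
    dist = hop_distances(adj, node, d)
    sums = [0.0] * (d + 1)
    counts = [0] * (d + 1)
    for v, h in dist.items():
        sums[h] += features[v]
        counts[h] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Toy path graph 0-1-2-3 with one made-up scalar feature per atom.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: 1.0, 1: 2.0, 2: 4.0, 3: 8.0}
vec = local_feature_averages(adj, features, node=1, d=2)
```

With d = 2 the vector has three entries, one per hop distance, so increasing d lengthens the descriptor and widens the neighborhood it summarizes.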

Application of kernel functions for accurate similarity search in large chemical databases

et al. 2010

Self Cite


Background: Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening, among others. It is widely believed that structure-based methods provide an efficient way to run such queries. Recently, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions cannot be applied to large chemical compound databases because of their high computational complexity and the difficulty of indexing similarity search for large databases.

Results: To bridge graph kernel functions and similarity search in chemical databases, we applied a novel kernel-based similarity measurement, developed in our team, to measure the similarity of graph-represented chemicals. In our method, we utilize a hash table to support a new graph kernel function definition, efficient storage, and fast search. We have applied our method, named G-hash, to large chemical databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Moreover, the similarity measurement and the index structure are scalable to large chemical databases, with a smaller index size and faster query processing time than state-of-the-art indexing methods such as Daylight fingerprints, C-tree, and GraphGrep.

Conclusions: Designing an efficient similarity query processing method for large chemical databases is challenging, since we need to balance running-time efficiency and similarity search accuracy. Our previous similarity search method, G-hash, provides a new way to perform similarity search in chemical databases. An experimental study validates the utility of G-hash in chemical databases.
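The abstract says a hash table supports the kernel definition, storage, and search, but gives no details. The following is only a rough sketch of how a hash-indexed node-matching similarity could work, not the G-hash method itself. It assumes each atom is reduced to a discrete key (here a hypothetical `(element, degree)` tuple) and that the similarity of two molecules is the number of atom pairs sharing a key.

```python
from collections import defaultdict


def build_index(database):
    """Map each node key to {graph_id: number of nodes with that key}."""
    index = defaultdict(lambda: defaultdict(int))
    for graph_id, node_keys in database.items():
        for key in node_keys:
            index[key][graph_id] += 1
    return index


def query(index, query_keys, k):
    """Return the k graphs sharing the most matching node pairs with the query."""
    scores = defaultdict(int)
    for key in query_keys:
        # Only graphs that contain this key are touched; others cost nothing.
        for graph_id, count in index.get(key, {}).items():
            scores[graph_id] += count
    # Highest score first; ties broken by graph id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Made-up toy database: each molecule is a list of (element, degree) keys.
database = {
    "a": [("C", 2), ("C", 2), ("O", 1)],
    "b": [("C", 2), ("N", 1)],
    "c": [("S", 1)],
}
index = build_index(database)
top = query(index, [("C", 2), ("O", 1)], k=2)
```

Because a query only visits the hash buckets for its own keys, molecules with no matching atoms are never scored, which is the kind of saving an index-based approach aims for.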


“…The optimal assignment kernel [11] computes optimal assignment between two attributed molecular graphs and uses that similarity score as the value of the kernel function for classification. In [10], graph wavelet alignment kernels were proposed and experiments showed that they can achieve comparable performances like optimal assignment methods but are generally faster.…”

Section: B Graph Kernels (mentioning)

confidence: 99%
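The optimal assignment idea in the statement above can be illustrated on very small inputs. The sketch below is a brute-force version under stated assumptions: atoms are hypothetical `(element, degree)` tuples, `atom_similarity` is a made-up atom kernel, and bond similarities are ignored. It tries every one-to-one assignment, so it is only usable for a handful of atoms; practical implementations would use a polynomial-time assignment solver such as the Hungarian algorithm.

```python
from itertools import permutations


def atom_similarity(a, b):
    """Hypothetical atom kernel: 1.0 if identical, 0.5 if same element, else 0.0."""
    if a == b:
        return 1.0
    if a[0] == b[0]:
        return 0.5
    return 0.0


def optimal_assignment_score(atoms_x, atoms_y):
    """Maximum total similarity over all one-to-one atom assignments.

    Each atom of the smaller molecule is matched to a distinct atom of the
    larger one. Brute force: the number of candidates grows factorially.
    """
    small, large = sorted((atoms_x, atoms_y), key=len)
    best = 0.0
    for chosen in permutations(range(len(large)), len(small)):
        total = sum(atom_similarity(small[i], large[j])
                    for i, j in enumerate(chosen))
        best = max(best, total)
    return best


# Made-up toy molecules.
x = [("C", 2), ("O", 1)]
y = [("O", 1), ("C", 3), ("N", 1)]
score = optimal_assignment_score(x, y)
```

Here the best assignment pairs the two oxygen atoms (1.0) and the two carbon atoms of different degree (0.5), and the resulting score is what such a kernel would pass to a classifier.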

“…Using graph representations of chemical structures have become popular in recent research [10] [11]. It is easy to model chemical compounds by a graph representation where nodes are usually used to model atoms in the chemical structure and edges are used to model bonds in the chemical structure.…”

Section: Introduction (mentioning)

confidence: 99%

“…It is also possible to incorporate several kinds of similarity features on graphs based on different data types. [12] Most graph kernel functions, when applied to QSAR models, aim to compute the similarity score or the alignment of two chemical molecules viewed as attributed molecular graphs [12] [11] [10]. A major difficulty in the previous work is how to design the graph kernels, so that their design can be domain independent in order to save the designers' time and effort.…”

Section: Introduction (mentioning)

confidence: 99%


Predicting chemical activities from structures by attributed molecular graph classification

Xue et al. 2010

2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology


Designing Quantitative Structure-Activity Relationship (QSAR) models has been a recurrent research interest for biologists and computer scientists. An example is predicting the toxicity of chemical compounds using their structural properties as features represented by graphs. A popular way to classify these graphs is to combine classifiers such as support vector machines (SVMs) with graph kernels that incorporate the sequential, structural and chemical information. Previous work has focused on designing specific graph kernels for this task, among which graph alignment kernels are one of the most popular approaches. Graph alignment kernels align the nodes of one graph to the nodes of the second graph so that the total overall similarity is maximized with respect to all possible alignments. However, taking both vertex and edge similarities into account makes the problem NP-hard. In this paper, we present a novel general graph-matching based method for QSAR. We view the problem of calculating optimal assignments of two attributed graphs from a different perspective. Instead of first designing an atom kernel function and a bond kernel function, we first provide a training set of pairs of graphs with their corresponding matchings. We then try to learn the compatibility function over atoms and use only the atom kernel function to compute graph matchings. Our algorithm has the advantage of being more general than previous approaches to the QSAR problem while remaining efficient. We evaluate our method on a set of chemical structure-activity prediction benchmark datasets, and show that our algorithm can achieve accuracy better than or comparable to the optimal assignment kernel method.


“…Crovella and Kolaczyk used graph wavelet theory to analyze the network traffic data [50]. Smalter et al applied graph wavelet theory to drug virtual screening [51].…”

Section: Introduction (mentioning)

confidence: 99%

Identification of DNA-Binding and Protein-Binding Proteins Using Enhanced Graph Wavelet Features

Zhu, Zhou, Dai, et al. 2013

IEEE/ACM Trans. Comput. Biol. and Bioinf.


Interactions between biomolecules play an essential role in various biological processes. For predicting DNA-binding or protein-binding proteins, many machine-learning-based techniques have used various types of features to represent the interface of the complexes, but they only deal with the properties of a single atom in the interface and do not directly take into account the information of neighboring atoms. This paper proposes a new feature representation method for biomolecular interfaces based on graph wavelet theory. The enhanced graph wavelet features (EGWF) provide an effective way to characterize interface features by adding physicochemical features and exploiting a graph wavelet formulation. In particular, the graph wavelet condenses the information around the center atom, and thus enhances the discrimination of features of biomolecule-binding proteins in the feature space. Experimental results show that EGWF performs effectively for predicting DNA-binding and protein-binding proteins in terms of the Matthews correlation coefficient (MCC) score and the area under the receiver operating characteristic curve (AUC).
