RASCAL: Calculation of Graph Similarity using Maximum Common Edge Subgraphs

Raymond, John W.

doi:10.1093/comjnl/45.6.631

Cited by 272 publications

(262 citation statements)

References 29 publications

Mentioning: 260


“…This is a computationally expensive process, and thus there are a number of strategies to simplify the computations. For instance Raymond et al (2002) propose to first compute an upper bound of the similarity measure, and only compute the actual MCS for those molecules for which the upper bound is over a given threshold. Computing the MCS is a very related problem to that of finding the anti-unification in refinement graphs, and thus S λ is related to graph-based similarities for molecules.…”

Section: Related Work (mentioning)

confidence: 99%
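The screening strategy described in the statement above can be sketched as follows. This is a minimal illustration, not the published RASCAL procedure: the graph encoding (a dict of node labels plus an edge list), the function names, and the label-counting bound are all assumptions made here, and the squared-overlap similarity is the form in which the RASCAL measure is commonly stated rather than something this page spells out. The exact maximum-common-subgraph step is left as a caller-supplied function, `exact_similarity`, because it is the expensive part the bound is meant to avoid.

```python
from collections import Counter


def label_counts(graph):
    """Count vertex labels and edge types; graph = (labels_by_node, edge_list)."""
    labels, edges = graph
    vertex_counts = Counter(labels.values())
    # An edge type is the unordered pair of its endpoint labels.
    edge_counts = Counter(tuple(sorted((labels[a], labels[b]))) for a, b in edges)
    return vertex_counts, edge_counts


def upper_bound(g1, g2):
    """Cheap upper bound on the similarity, computed from label counts only.

    No common subgraph can contain more vertices of a label, or more edges
    of a type, than the smaller of the two counts, so the multiset
    intersection bounds the size of the common subgraph from above.
    """
    v1, e1 = label_counts(g1)
    v2, e2 = label_counts(g2)
    size1 = sum(v1.values()) + sum(e1.values())
    size2 = sum(v2.values()) + sum(e2.values())
    if size1 == 0 or size2 == 0:
        return 0.0
    common = sum((v1 & v2).values()) + sum((e1 & e2).values())
    # Assumed similarity form: (|V12| + |E12|)^2 / ((|V1| + |E1|) * (|V2| + |E2|)).
    return common ** 2 / (size1 * size2)


def screen(query, database, threshold, exact_similarity):
    """Run the expensive exact comparison only where the bound allows a hit."""
    hits = []
    for name, graph in database.items():
        if upper_bound(query, graph) < threshold:
            continue  # the exact similarity cannot reach the threshold
        score = exact_similarity(query, graph)
        if score >= threshold:
            hits.append((name, score))
    return hits
```

Because the bound never underestimates the exact value under these assumptions, skipping a candidate whose bound falls below the threshold cannot discard a true hit; it only saves the cost of the exact comparison.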

Similarity measures over refinement graphs

Ontañón

Plaza

2011

Mach Learn


Similarity assessment plays a key role in lazy learning methods such as k-nearest neighbor or case-based reasoning. Similarity also plays a crucial role in support vector machines. In this paper we will show how refinement graphs, which were originally introduced for inductive learning, can be employed to assess and reason about similarity. We will define and analyze two similarity measures, S λ and S π , based on refinement graphs. The anti-unification-based similarity, S λ , assesses similarity by finding the anti-unification of two instances, which is a description capturing all the information common to these two instances. The property-based similarity, S π , is based on a process of disintegrating the instances into a set of properties, and then analyzing these property sets. Moreover, these similarity measures are applicable to any representation language for which a refinement graph that satisfies the requirements we identify can be defined. Specifically, we present a refinement graph for feature terms, in which several languages of increasing expressiveness can be defined. The similarity measures are empirically evaluated on relational data sets belonging to languages of different expressiveness.


“…Most of these algorithms avoid the computational complexity by computing approximate values. Raymond et al [14] also propose an exact multi-step algorithm which defines a similarity based on computing the maximum common subgraph. This algorithm is theoretically still NP-complete, but makes use of advanced heuristics to reduce the number of matchings required.…”

Section: Related Work (mentioning)

confidence: 99%

“…Our method is based on this idea of graph similarity in function of the maximum common subgraph, and thus shares the intuitiveness of [14], although the latter requires an advanced graph-theoretical problem transformation and is difficult to implement. Another difference is that, by using the BBP subgraph isomorphism of [8], we can obtain a polynomial algorithm which only takes into account a subset of matchings and in this way imposes a bias on the features that will be used for computing the similarity.…”

Section: Related Work (mentioning)

confidence: 99%

An Efficiently Computable Graph-Based Metric for the Classification of Small Molecules

et al. 2008


Abstract. In machine learning, there has been an increased interest in metrics on structured data. The application we focus on is drug discovery. Although graphs have become very popular for the representation of molecules, a lot of operations on graphs are NP-complete. Representing the molecules as outerplanar graphs, a subclass within general graphs, and using the block-and-bridge preserving subgraph isomorphism, we define a metric and we present an algorithm for computing it in polynomial time. We evaluate this metric and more generally also the block-and-bridge preserving matching operator on a large dataset of molecules, obtaining favorable results.


“…This provides a natural way of calculating the degree of similarity between a pair of molecules but the NP-complete nature of the maximum common subgraph isomorphism problem has ruled out the large-scale use of MCS-based similarities. We have recently described a new MCS algorithm, called RASCAL, that is sufficiently rapid in execution to permit graph-based similarity searching of large chemical databases [16,17] and that seems to provide a viable complement, or even an alternative, to existing, fingerprint-based approaches to virtual screening [18].…”

Section: Introduction (mentioning)

confidence: 99%
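To make the cost mentioned in the statement above concrete, here is a brute-force sketch of the maximum common edge subgraph size for two tiny labelled graphs. It is not the RASCAL algorithm: it simply tries every injective vertex mapping, so its running time grows factorially with graph size, which is the behaviour that motivates faster methods. The graph encoding (a dict of node labels plus an edge list) and the name `mces_size` are assumptions for illustration, and the graphs are assumed to be simple (no self-loops, no repeated edges).

```python
from itertools import permutations


def mces_size(g1, g2):
    """Largest number of edges two labelled graphs can share, by exhaustive search.

    Each graph is (labels_by_node, edge_list). The common subgraph found this
    way is not required to be connected.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    # Map the smaller graph into the larger one so an injection always exists.
    if len(labels1) > len(labels2):
        labels1, edges1, labels2, edges2 = labels2, edges2, labels1, edges1
    nodes1 = list(labels1)
    edge_set2 = {frozenset(edge) for edge in edges2}
    best = 0
    # Every ordered choice of distinct target nodes is one injective mapping.
    for image in permutations(labels2, len(nodes1)):
        mapping = dict(zip(nodes1, image))
        shared = sum(
            1
            for a, b in edges1
            if labels1[a] == labels2[mapping[a]]
            and labels1[b] == labels2[mapping[b]]
            and frozenset((mapping[a], mapping[b])) in edge_set2
        )
        best = max(best, shared)
    return best
```

For two graphs of 20 atoms each this loop would already visit 20! mappings, so it is only useful as a reference implementation for checking faster code on small inputs.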

Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures

Raymond

Blankley

Willett

2003

Journal of Molecular Graphics and Modelling


This paper compares several published methods for clustering chemical structures, using both fingerprint-based and graph-based similarity measures. The clusterings from each method were compared to determine the degree of cluster overlap. Each method was also evaluated on how well it grouped structures into clusters possessing a non-trivial substructural commonality. The methods which employ adjustable parameters were tested to determine the stability of each parameter for datasets of varying size and composition. Our experiments suggest that both fingerprint-based and graph-based similarity measures can be used effectively for generating chemical clusterings; it is also suggested that the CAST method, suggested recently for the clustering of gene expression patterns, may also prove effective for the clustering of 2D chemical structures.

