Jason Tsong-Li Wang scite author profile

A distance-mapping algorithm takes a set of objects and a distance metric and then maps those objects to a Euclidean or pseudoEuclidean space in such a way that the distances among objects are approximately preserved. Distance mapping algorithms are a useful tool for clustering and visualization in data intensive applications, because they replace expensive distance calculations by sum-of-square calculations.This can make clustering in large databases with expensive distance metrics practical.In this paper we present five distance-mapping algorithms and conduct experiments to compare their performance in data clustering applications.These include two algorithms called FastMap and MetricMap, and three hybrid heuristics that combine the two algorithms in different ways. Experimental results on both synthetic and RNA data show the superiority of the hybrid algorithms. The results imply that FastMap and MetricMap capture complementary information about distance metrics and therefore can be used together to great benefit. The net effect is that multi-day computations may be done in minutes. *

show abstract

Combinatorial pattern discovery for scientific data

Wang

Chirn

Marr

et al. 1994

102

View full text Add to dashboard Cite

show abstract

On the Editing Distance Between Undirected Acyclic Graphs

Zhang

Wang

Shasha

1996

Int. J. Found. Comput. Sci.

View full text Add to dashboard Cite

We consider the problem of comparing CUAL graphs (Connected, Undirected, Acyclic graphs with nodes being Labeled). This problem is motivated by the study of information retrieval for bio-chemical and molecular databases. Suppose we define the distance between two CUAL graphs G1 and G2 to be the weighted number of edit operations (insert node, delete node and relabel node) to transform G1 to G2. By reduction from exact cover by 3-sets, one can show that finding the distance between two CUAL graphs is NP-complete. In view of the hardness of the problem, we propose a constrained distance metric, called the degree-2 distance, by requiring that any node to be inserted (deleted) have no more than 2 neighbors. With this metric, we present an efficient algorithm to solve the problem. The algorithm runs in time O(N1N2D2) for general weighting edit operations and in time [Formula: see text] for integral weighting edit operations, where Ni, i=1, 2, is the number of nodes in Gi, D=min{d1, d2} and di is the maximum degree of Gi.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.