The goal of dimensionality reduction is to embed high-dimensional data in a low-dimensional space while preserving structure in the data relevant to exploratory data analysis, such as clusters. However, existing dimensionality reduction methods often either fail to separate clusters due to the crowding problem or can only separate clusters at a single resolution. We develop a new approach to dimensionality reduction: tree preserving embedding. Our approach uses the topological notion of connectedness to separate clusters at all resolutions. We provide a formal guarantee of cluster separation for our approach that holds for finite samples. Our approach requires no parameters and can handle general types of data, making it easy to use in practice and suggesting new strategies for robust data visualization.

hierarchical clustering | multidimensional scaling

Visualization is an important first step in the analysis of high-dimensional data (1). High-dimensional data often has low intrinsic dimensionality, making it possible to embed the data in a low-dimensional space while preserving much of its structure (2). However, it is rarely possible to preserve all types of structure in the embedding. Therefore, dimensionality reduction methods can only aim to preserve particular types of structure. Linear methods such as principal component analysis (PCA) (3) and classical multidimensional scaling (MDS) (4-6) preserve global distances, while nonlinear methods such as manifold learning methods (7-9) preserve local distances defined by kernels or neighborhood graphs. However, most dimensionality reduction methods fail to preserve clusters (10), which are often of greatest interest.

Clusters are difficult to preserve in embeddings due to the so-called crowding problem (11). When the intrinsic dimensionality of the data exceeds the embedding dimensionality, there is not enough space in the embedding to allow clusters to separate. Therefore, clusters are forced to collapse on top of each other in the embedding. As the embedding dimensionality increases, there is more space in the embedding for clusters to separate and the crowding problem disappears, making it possible to preserve clusters exactly (12). However, because the embedding dimensionality is at most two or three for visualization purposes, the crowding problem is prevalent in practice (see the numerical sketch following this section). When the clusters are known, they can be used to guide the embedding to avoid the crowding problem (13). However, the embedding is often used to help find the clusters in the first place. Therefore, it is important to solve the crowding problem without knowledge of the clusters.

Force-based methods such as stochastic neighbor embedding (SNE) (14), variants of SNE (10, 11, 15, 16), and local MDS (17) have been proposed to overcome the crowding problem. Force-based methods use attractive forces to pull together similar points and repulsive forces to push apart dissimilar points. SNE and its variants use forces based on kernels, while local MDS uses forces based on neighborhood graphs. Force-base...
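To make the crowding problem concrete, the following sketch embeds well-separated high-dimensional clusters into two dimensions with a linear method and compares cluster separation before and after. This is only a minimal illustration, not part of the paper's method: the synthetic blobs, the use of PCA (which, for Euclidean distances, matches classical MDS up to rotation), the particular parameter values, and the silhouette score as a separation measure are all illustrative assumptions, assuming NumPy and scikit-learn are available.

```python
# Minimal illustration of the crowding problem: clusters that are well
# separated in a high-dimensional space become crowded when forced into 2D.
# Assumes scikit-learn; data and parameters are illustrative, not from the paper.
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Twenty well-separated clusters in 50 dimensions: the intrinsic
# dimensionality of the cluster structure exceeds the 2D embedding dimension.
X, labels = make_blobs(n_samples=1000, n_features=50, centers=20,
                       cluster_std=2.0, random_state=0)

# Linear embedding into two dimensions.
Y = PCA(n_components=2).fit_transform(X)

# Cluster separation before and after embedding; a higher silhouette score
# means better-separated clusters, and the 2D score is typically much lower,
# reflecting crowding.
print("silhouette, 50D data:    ", round(silhouette_score(X, labels), 3))
print("silhouette, 2D embedding:", round(silhouette_score(Y, labels), 3))
```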
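The attractive/repulsive mechanism behind force-based methods can likewise be sketched with a toy layout: points that are nearest neighbors in the original space attract each other in the 2D embedding, while all other pairs weakly repel. This is a schematic sketch of the general idea only, not SNE, its variants, or local MDS as defined in the cited work; the neighbor count, force forms, step sizes, and the `force_layout` helper are assumptions made for illustration.

```python
# Toy force-based layout: attraction along a high-dimensional nearest-neighbor
# graph, weak inverse-square repulsion between all other pairs. Schematic only;
# not SNE or local MDS. Assumes NumPy and scikit-learn.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances

def force_layout(X, n_neighbors=10, n_iter=300, step=0.01, repulsion=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Neighborhood graph built from distances in the original space.
    D = pairwise_distances(X)
    nbrs = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]   # skip self at column 0
    A = np.zeros((n, n), dtype=bool)
    A[np.arange(n)[:, None], nbrs] = True
    A = A | A.T                                          # symmetrize the graph
    Y = rng.normal(size=(n, 2))                          # random 2D start
    for _ in range(n_iter):
        diff = Y[:, None, :] - Y[None, :, :]             # (n, n, 2) displacements
        dist = np.linalg.norm(diff, axis=-1) + 1e-9
        unit = diff / dist[..., None]
        # Attractive force: springs along neighbor edges (pull i toward j).
        attract = (A * dist)[..., None] * unit
        # Repulsive force: inverse-square push between non-neighbor pairs.
        repel = (~A / dist**2)[..., None] * unit
        Y -= step * (attract.sum(axis=1) - repulsion * repel.sum(axis=1))
    return Y

# Small synthetic example: the result is a rough 2D layout in which points
# sharing many high-dimensional neighbors end up close together.
X, labels = make_blobs(n_samples=200, n_features=20, centers=5,
                       cluster_std=2.0, random_state=0)
Y = force_layout(X)
```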