As a diffusion distance, we propose to use a metric closely related to cosine similarity, defined as the $L^2$ distance between two $L^2$-normalized vectors. We provide a mathematical explanation of why this normalization makes diffusion distances more meaningful. Our proposal is in contrast to that made some years ago by R. Coifman, which uses the $L^2$ distance between certain $L^1$ unit vectors. In the second part of the paper, we give two proofs that an extension of mean first passage time to mean first passage cost satisfies the triangle inequality; we do not assume that the underlying Markov matrix is diagonalizable. We conclude by exhibiting an interesting connection between the normalized mean first passage time and the discretized solution of a certain Dirichlet-Poisson problem, and we verify our result numerically for the simple case of the unit circle.
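The contrast between the two normalizations described above can be illustrated with a minimal sketch. The vectors `u` and `v` below are hypothetical toy profiles (flat, with disjoint supports), not diffusion vectors from the paper, and the function names are our own. The sketch only shows that the $L^2$ distance between $L^1$ unit vectors can be small while the vectors are orthogonal, whereas the $L^2$ distance between $L^2$ unit vectors equals the chord length $\sqrt{2 - 2\cos\theta}$.

```python
import numpy as np

def l1_normalized_l2_distance(u, v):
    """L2 distance between u and v after scaling each to unit L1 norm."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.linalg.norm(u / np.abs(u).sum() - v / np.abs(v).sum()))

def chord_distance(u, v):
    """L2 distance between u and v after scaling each to unit L2 norm."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.linalg.norm(u / np.linalg.norm(u) - v / np.linalg.norm(v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy data: two flat profiles with disjoint supports,
# so the angle between them is 90 degrees.
n = 1000
u = np.zeros(2 * n)
v = np.zeros(2 * n)
u[:n] = 1.0
v[n:] = 1.0

d_l1 = l1_normalized_l2_distance(u, v)  # equals sqrt(2 / n): shrinks as n grows
d_chord = chord_distance(u, v)          # equals sqrt(2): maximal for nonnegative vectors
cos_uv = cosine_similarity(u, v)        # equals 0: the vectors are orthogonal

print(d_l1, d_chord, cos_uv)
```

For these toy vectors the first distance decays like $\sqrt{2/n}$ as the supports widen, while the chord distance stays at $\sqrt{2}$ regardless of $n$.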
Journal of Applied Mathematics

consider to be far apart, are actually close to each other in $L^2$, even though the angle between them is large, because they have small $L^2$ norm while still having unit $L^1$ norm. Additionally, when Coifman's distance is applied to heat flow in $\mathbb{R}^n$, a factor of a power of the time $t$ remains, with the exponent depending on the dimension $n$. It would be desirable not to have such a factor.

Our main motivation for this paper is to propose an alternate diffusion metric, which finds the $L^2$ distance between two $L^2$ unit vectors (with analogous statements for the discrete case). Our distance is thus the length of the chord joining the tips, on the unit hypersphere, of two $L^2$-normalized diffusion vectors, and is therefore based on cosine similarity (see (4.4) below). Cosine similarity affinity is popular in kernel methods in machine learning; see, for example, [5, 6] (in particular, Section 3.5.1, Document Clustering Basics) and, for a review of kernel methods in machine learning, [7].

In the case of heat flow on $\mathbb{R}^n$, our proposed distance has the property that no dimension-dependent factor is left. Furthermore, for a general manifold, our diffusion distance gives, approximately, a scaled geodesic distance between two points $x$ and $y$ when $x$ and $y$ are closer than $\sqrt{t}$, and maximum separation when the geodesic distance between $x$ and $y$, scaled by $\sqrt{t}$, goes to infinity.

We next give two proofs that the mean first passage cost, defined later in this paper as the cost to visit a particular point for the first time after leaving a specified point, satisfies the triangle inequality. See Theorem 4.2 in [8], in which the author states that the triangle inequality holds for the mean first passage time. Our proofs do not assume that the underlying Markov matrix is diagonalizable, and they do not rely on spectral theory.
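The triangle inequality for the mean first passage time mentioned above can be checked numerically on a small example. The following is a minimal sketch, not the paper's proof or its cost extension: it assumes unit cost per step, adopts the convention that the passage time from a state to itself is zero, and uses a randomly generated strictly positive (hence irreducible) transition matrix. The function name `mean_first_passage_times` is our own. The computation solves a linear system for each target state and therefore does not use an eigendecomposition of the Markov matrix.

```python
import numpy as np

def mean_first_passage_times(P):
    """Return M with M[i, j] = expected number of steps to first reach j from i.

    Convention (an assumption of this sketch): M[j, j] = 0.
    For each target j, the hitting times h from the other states solve
    (I - Q) h = 1, where Q is P with row and column j removed.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        Q = P[np.ix_(keep, keep)]
        h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
        M[keep, j] = h
    return M

# Hypothetical test chain: strictly positive entries guarantee irreducibility.
rng = np.random.default_rng(0)
P = rng.random((6, 6)) + 0.05
P /= P.sum(axis=1, keepdims=True)  # make each row sum to 1

M = mean_first_passage_times(P)

# Largest value of M[i, k] - M[i, j] - M[j, k] over all triples (i, j, k);
# the triangle inequality holds when this is at most 0 (up to rounding).
violation = (M[:, None, :] - M[:, :, None] - M[None, :, :]).max()
print(violation)
```

Note that the matrix `M` is generally not symmetric, so the triangle inequality is checked here for ordered triples; a symmetrized quantity such as `M + M.T` inherits the inequality.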
We calculate explicitly the normalized limit of the mean first passage time for the unit circle $S^1$ by identifying the limit as the solution of a specific Dirichlet-Poisson problem on $S^1$. We also provide numerical verification of our calculation.

The paper is organized as follows. After a section on notation, we discuss R. Coifman's d...