We introduce a concept of similarity between vertices of directed graphs. Let $G_A$ and $G_B$ be two directed graphs with $n_A$ and $n_B$ vertices, respectively. We define an $n_B \times n_A$ similarity matrix $S$ whose real entry $s_{ij}$ expresses how similar vertex $j$ (in $G_A$) is to vertex $i$ (in $G_B$); we say that $s_{ij}$ is their similarity score. The similarity matrix can be obtained as the limit of the normalized even iterates of $S(k+1) = B S(k) A^T + B^T S(k) A$, where $A$ and $B$ are the adjacency matrices of the graphs and $S(0)$ is a matrix whose entries are all equal to one. In the special case where $G_A = G_B = G$, the matrix $S$ is square and the score $s_{ij}$ is the similarity score between the vertices $i$ and $j$ of $G$. We point out that Kleinberg's "hub and authority" method for identifying web pages relevant to a given query can be viewed as a special case of our definition, namely the case where one of the graphs has two vertices and a unique directed edge between them. In analogy to Kleinberg, we show that our similarity scores are given by the components of a dominant eigenvector of a non-negative matrix. Potential applications of our similarity concept are numerous. We illustrate an application to the automatic extraction of synonyms in a monolingual dictionary.
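The iteration described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names (`similarity_matrix`, `matmul`, `transpose`, `normalize`) and the `even_steps` parameter are hypothetical, the Frobenius norm is assumed for the normalization step (the abstract only says "normalized"), and a fixed number of steps stands in for a proper convergence test.

```python
import math


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    y_t = transpose(y)
    return [[sum(p * q for p, q in zip(row, col)) for col in y_t] for row in x]


def normalize(m):
    """Scale a matrix to unit Frobenius norm (assumed choice of norm)."""
    norm = math.sqrt(sum(v * v for row in m for v in row))
    if norm == 0.0:
        return m  # e.g. graphs without edges: nothing to scale
    return [[v / norm for v in row] for row in m]


def similarity_matrix(a, b, even_steps=50):
    """Approximate S for adjacency matrices a (of G_A) and b (of G_B).

    Runs S(k+1) = B S(k) A^T + B^T S(k) A from the all-ones matrix,
    normalizing after every step, and stops at an even iterate.
    The result has n_B rows and n_A columns.
    """
    n_a, n_b = len(a), len(b)
    s = [[1.0] * n_a for _ in range(n_b)]
    a_t, b_t = transpose(a), transpose(b)
    for _ in range(2 * even_steps):  # always an even number of steps
        forward = matmul(matmul(b, s), a_t)
        backward = matmul(matmul(b_t, s), a)
        s = normalize(
            [[f + g for f, g in zip(f_row, g_row)]
             for f_row, g_row in zip(forward, backward)]
        )
    return s


# Kleinberg's special case: G_A is the two-vertex graph hub -> authority.
hub_authority = [[0, 1], [0, 0]]
# G_B: vertex 0 points to vertices 1 and 2.
star = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]

scores = similarity_matrix(hub_authority, star)
for vertex, (hub, authority) in enumerate(scores):
    print(f"vertex {vertex}: hub={hub:.3f} authority={authority:.3f}")
```

With the two-vertex graph as $G_A$, the first column of the result plays the role of the hub scores and the second column the authority scores. In this small example the odd and even iterates settle on different values, which is why the limit is taken over the even iterates only.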
We apply the method to the Citric Acid Cycle and the Glycolysis pathways of different groups of organisms, as well as to the Carbohydrate metabolic networks. Phylogenetic trees obtained from the experiments were close to existing phylogenies and revealed interesting relationships among organisms.