In the densest subgraph problem, given an edge-weighted undirected graph G = (V, E, w), we are asked to find S ⊆ V that maximizes the density, i.e., w(S)/|S|, where w(S) is the sum of the weights of the edges in the subgraph induced by S. This problem is used in a wide variety of graph mining applications. However, it has a drawback: the obtained subset may be too large or too small in comparison with the size desired in the application at hand. In this study, we address the size issue of the densest subgraph problem by generalizing the density of S ⊆ V. Specifically, we introduce the f-density of S ⊆ V, which is defined as w(S)/f(|S|), where f : Z_{≥0} → R_{≥0} is a monotonically non-decreasing function. In the f-densest subgraph problem (f-DS), we aim to find S ⊆ V that maximizes the f-density w(S)/f(|S|). Although f-DS does not explicitly specify the size of the output subset of vertices, we can handle the above size issue by choosing a convex or concave size function f appropriately. For f-DS with a convex function f, we propose a nearly-linear-time algorithm with a provable approximation guarantee. On the other hand, for f-DS with a concave function f, we propose an LP-based exact algorithm, a flow-based O(|V|^3)-time exact algorithm for unweighted graphs, and a nearly-linear-time approximation algorithm.
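To make the definition of f-density concrete, here is a minimal Python sketch that evaluates w(S)/f(|S|) and finds a maximizer by exhaustive search on a tiny graph. It is not any of the algorithms proposed in the abstract; the function names, the edge-list representation, and the example graph are assumptions made for illustration only.

```python
import math
from itertools import combinations


def induced_weight(edges, subset):
    """w(S): total weight of edges with both endpoints in the subset."""
    s = set(subset)
    return sum(w for u, v, w in edges if u in s and v in s)


def f_density(edges, subset, f):
    """f-density w(S) / f(|S|) for a size function f with f(|S|) > 0."""
    return induced_weight(edges, subset) / f(len(subset))


def brute_force_f_densest(vertices, edges, f):
    """Exhaustive search over non-empty subsets (exponential time).

    Illustration only: suitable for graphs with a handful of vertices.
    """
    best, best_val = None, float("-inf")
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            val = f_density(edges, subset, f)
            if val > best_val:
                best, best_val = set(subset), val
    return best, best_val


# Hypothetical example graph: a weighted triangle with a pendant vertex.
vertices = [0, 1, 2, 3]
edges = [(0, 1, 3.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 1.0)]

linear = brute_force_f_densest(vertices, edges, lambda x: x)        # f(x) = x
convex = brute_force_f_densest(vertices, edges, lambda x: x ** 2)   # convex f
concave = brute_force_f_densest(vertices, edges, math.sqrt)         # concave f
```

On this example, the ordinary density f(x) = x selects {0, 1, 2}, the convex choice f(x) = x^2 selects the smaller set {0, 1}, and the concave choice f(x) = √x selects the whole vertex set, which matches the size effect described in the abstract.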
Cheeger's inequality states that a tightly connected subset can be extracted from a graph G using an eigenvector of the normalized Laplacian associated with G. More specifically, we can compute a subset with conductance O(√φ_G), where φ_G is the minimum conductance of a set in G. It has recently been shown that Cheeger's inequality can be extended to hypergraphs. However, as the normalized Laplacian of a hypergraph is no longer a matrix, we can only approximate its eigenvectors; this causes a loss in the conductance of the obtained subset. To address this problem, we here consider the heat equation on hypergraphs, which is a differential equation exploiting the normalized Laplacian. We show that the heat equation has a unique solution and that we can extract a subset with conductance √φ_G from the solution. An analogous result also holds for directed graphs.
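For readers unfamiliar with how a subset is extracted from a vector, the following Python sketch shows the standard sweep-cut procedure on an ordinary undirected graph: sort the vertices by their score and return the prefix with the smallest conductance. It is not the hypergraph heat-equation method of the abstract. The adjacency-list representation, the function names, and the hand-picked score vector (which stands in for an eigenvector or a heat-equation solution) are all assumptions.

```python
def conductance(adj, subset):
    """cut(S, V \\ S) / min(vol(S), vol(V \\ S)) for an unweighted graph."""
    s = set(subset)
    vol_s = sum(len(adj[v]) for v in s)
    vol_rest = sum(len(adj[v]) for v in adj) - vol_s
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    denom = min(vol_s, vol_rest)
    return cut / denom if denom > 0 else float("inf")


def sweep_cut(adj, scores):
    """Return the proper prefix of the score ordering with least conductance."""
    order = sorted(adj, key=lambda v: scores[v])
    best_set, best_phi = None, float("inf")
    for k in range(1, len(order)):
        prefix = order[:k]
        phi = conductance(adj, prefix)
        if phi < best_phi:
            best_set, best_phi = set(prefix), phi
    return best_set, best_phi


# Hypothetical example: two triangles joined by the single edge (2, 3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
# Placeholder scores; in practice these would come from a computed vector.
scores = {0: -1.0, 1: -1.0, 2: -0.5, 3: 0.5, 4: 1.0, 5: 1.0}

best_set, best_phi = sweep_cut(adj, scores)
```

Here the sweep returns the triangle {0, 1, 2}, whose conductance is 1/7: one cut edge over a volume of 7.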
Identifying community structure in networks is an issue of particular interest in network science. The modularity introduced by Newman and Girvan is the most popular quality function for community detection in networks. In this study, we identify a problem in the concept of modularity and suggest a solution to overcome this problem. Specifically, we obtain a new quality function for community detection. We refer to the function as Z-modularity because it measures the Z-score of a given partition with respect to the fraction of the number of edges within communities. Our theoretical analysis shows that Z-modularity mitigates the resolution limit of the original modularity in certain cases. Computational experiments using both artificial networks and well-known real-world networks demonstrate the validity and reliability of the proposed quality function.
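As a rough illustration of a Z-score-based quality function, the Python sketch below compares the observed fraction of intra-community edges with the fraction expected under a simple null model. The abstract does not state the formula, so everything here is an assumption: each edge is taken to fall inside some community independently with probability p = Σ_C (D_C/2m)^2, where D_C is the total degree of community C and m is the number of edges, and the constant factor √m is dropped. The function name and the convention for the degenerate case are also hypothetical.

```python
import math


def z_modularity(edges, communities):
    """Z-score of the intra-community edge fraction (assumed formulation).

    edges: list of (u, v) pairs of an unweighted undirected graph.
    communities: list of disjoint vertex collections covering all vertices.
    """
    m = len(edges)
    label = {v: i for i, c in enumerate(communities) for v in c}

    # Observed fraction of edges whose endpoints share a community.
    inside = sum(1 for u, v in edges if label[u] == label[v])

    # Total degree D_C of each community.
    deg = [0] * len(communities)
    for u, v in edges:
        deg[label[u]] += 1
        deg[label[v]] += 1

    # Assumed null-model probability that an edge is intra-community.
    p = sum((d / (2 * m)) ** 2 for d in deg)
    if p <= 0.0 or p >= 1.0:
        # Degenerate case (e.g. one community); returning 0.0 is a
        # convention of this sketch, not something taken from the source.
        return 0.0
    return (inside / m - p) / math.sqrt(p * (1.0 - p))


# Hypothetical example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
z = z_modularity(edges, [{0, 1, 2}, {3, 4, 5}])
```

For this partition, 6 of the 7 edges are internal and p = 0.5, so the numerator equals the usual modularity 5/14 and the score is 5/7 after dividing by √(p(1 − p)) = 0.5.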