Network connectivity optimization, which aims to manipulate network connectivity by changing its underlying topology, is a fundamental task behind a wealth of high-impact data mining applications, ranging from immunization, critical infrastructure construction, social collaboration mining, and bioinformatics analysis to intelligent transportation system design. To tackle its exponential computational complexity, greedy algorithms have been extensively used for network connectivity optimization by exploiting its diminishing returns property. Despite the empirical success, two key challenges largely remain open. First, on the theoretical side, the hardness and the approximability of the general network connectivity optimization problem are still poorly understood except for a few special instances. Second, on the algorithmic side, current algorithms often struggle to balance optimization quality against computational efficiency. In this paper, we systematically address these two challenges for the network connectivity optimization problem. First, we reveal some fundamental limits by proving that, for a wide range of network connectivity optimization problems, (1) they are NP-hard and (2) (1 − 1/e) is the optimal approximation ratio for any polynomial-time algorithm. Second, we propose an effective, scalable and general algorithm (CONTAIN) to carefully balance the optimization quality and the computational efficiency.

C(G) = Σ_{π ⊆ G} f(π)    (1)

where π is a subgraph of G, and f is a non-negative function that maps any subgraph in G to a non-negative real number (i.e., f : π → R⁺) [7]. Specifically, we have f(∅) = 0 for the empty set ∅; when f(π) > 0, we call the subgraph π a valid subgraph. In other words, the network connectivity C(G) can be viewed as a weighted aggregation of the connectivities of all valid subgraphs in the network.

By choosing an appropriate f(·) function (please refer to [7] for details), Eq.
(1) includes several prevalent network connectivity measures, e.g., path capacity (which is closely related to the epidemic threshold), triangle capacity (which is rooted in social balance theory) and natural connectivity (which is closely related to network robustness). In terms of computation, it is often much more efficient to either approximate or compute these connectivity measures by the associated eigen-function F(Λ^(r)), where Λ^(r) represents the top-r eigenvalues of the adjacency matrix A. For example, the path capacity converges to the leading eigenvalue of the adjacency matrix of the network [4], the triangle capacity can be approximated by the sum of cubes of the eigenvalues [33], and the natural connectivity is calculated by the sum of exponentials of the eigenvalues [16].
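To make the eigen-function view concrete, the following is a minimal sketch, not the authors' implementation, of how the three measures could be evaluated from the top-r eigenvalues of a symmetric adjacency matrix. The function names are hypothetical, "top-r" is assumed to mean the r eigenvalues largest in magnitude, the 1/6 factor uses the standard identity that the number of triangles equals tr(A³)/6, and the natural connectivity uses the common normalized form ln((1/n) Σ exp(λ_i)).

```python
import numpy as np


def top_eigenvalues(A, r):
    """Return the r eigenvalues of the symmetric matrix A largest in magnitude.

    Assumption: 'top-r' means largest absolute value; ties in magnitude
    favor the positive eigenvalue (up to floating-point rounding).
    """
    eigvals = np.linalg.eigvalsh(A)  # real eigenvalues, ascending order
    # lexsort uses the LAST key as primary: sort by -|lambda|, then -lambda.
    order = np.lexsort((-eigvals, -np.abs(eigvals)))
    return eigvals[order[:r]]


def path_capacity(eigs):
    """Leading eigenvalue, the limit the path capacity converges to."""
    return float(np.max(eigs))


def triangle_capacity(eigs):
    """Triangle count estimate: sum of cubed eigenvalues divided by 6."""
    return float(np.sum(eigs ** 3) / 6.0)


def natural_connectivity(eigs, n):
    """Log of the average of exp(lambda_i) over the n nodes.

    With r < n eigenvalues this is only an approximation.
    """
    return float(np.log(np.sum(np.exp(eigs)) / n))


# Toy example: a single triangle (complete graph on 3 nodes),
# whose adjacency eigenvalues are 2, -1, -1.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
n = A.shape[0]

eigs_full = top_eigenvalues(A, r=n)   # exact: all eigenvalues
eigs_top1 = top_eigenvalues(A, r=1)   # low-rank approximation

lead = path_capacity(eigs_full)
triangles = triangle_capacity(eigs_full)
nat_conn = natural_connectivity(eigs_full, n)
triangles_top1 = triangle_capacity(eigs_top1)
```

On this toy graph the full spectrum recovers exactly one triangle, while keeping only the leading eigenvalue gives 8/6, which illustrates the trade-off between r and accuracy. For large sparse networks a truncated solver (for example, a Lanczos-type routine) would replace the dense `eigvalsh` call; that substitution is an assumption of this sketch rather than something stated in the text.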
Network Connectivity Optimization

With the network connectivity measure in Eq. (1), we formally define the network connectivity optimization problem as follows.

Problem 1. Network Connectivity Optimization (NETCOP)
Given: (1) a network G; (2) a connectivity mapping function f : π → R⁺ which defines C(G); (3) a type of network operation (node deletion vs. edge deletion) and (4) a...