Given a network of nodes, minimizing the spread of a contagion under a limited budget is a well-studied problem with applications in network security, viral marketing, social networks, and public health. In real graphs, a virus may infect a node, which in turn infects its neighboring nodes, and this cascade may trigger an epidemic across the whole graph. The goal is therefore to select the best k nodes (the budget constraint) to immunize (vaccinate, screen, filter) so that the remaining graph is less prone to an epidemic. The problem is known to be computationally intractable, in all practical models, even for moderately sized graphs. In this paper we employ ideas from spectral graph theory to define the relevance and importance of nodes. Using novel graph-theoretic techniques, we then design an efficient approximation algorithm to immunize the graph. Theoretical guarantees on the running time of our algorithm show that it is more efficient than any other known solution in the literature. We test the performance of our algorithm on several real-world graphs. Experiments show that our algorithm scales well to large graphs and outperforms state-of-the-art algorithms in both quality (containment of the epidemic) and efficiency (runtime and space complexity).
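As a concrete illustration of the spectral viewpoint, the sketch below implements a generic greedy eigendrop heuristic: it repeatedly removes the node whose deletion most reduces the largest eigenvalue (the spectral radius) of the adjacency matrix, a standard proxy for a graph's vulnerability to epidemics. This is a minimal sketch of the general technique under that assumption, not the paper's algorithm; the function names and the toy graph are illustrative.

```python
# A minimal sketch of greedy spectral immunization (not the paper's
# algorithm): repeatedly remove the node whose deletion most reduces
# the spectral radius of the adjacency matrix.
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return np.max(np.linalg.eigvalsh(A))

def greedy_immunize(A, k):
    """Pick k nodes whose greedy removal minimizes the spectral
    radius of the remaining graph."""
    alive = list(range(A.shape[0]))
    chosen = []
    for _ in range(k):
        best_idx, best_r = None, np.inf
        for idx, v in enumerate(alive):
            rest = [u for u in alive if u != v]
            r = spectral_radius(A[np.ix_(rest, rest)])
            if r < best_r:
                best_idx, best_r = idx, r
        chosen.append(alive.pop(best_idx))
    return chosen

# Toy graph: a triangle (0,1,2) attached to a path (2,3,4).
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
print(greedy_immunize(A, 2))  # removes the cut vertex 2 first
```

Note that each greedy step above requires a full eigendecomposition per candidate node; avoiding this per-step cost is precisely what makes faster approximation algorithms attractive.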
Immunizing a subset of nodes in a network, enabling them to identify and withstand the spread of harmful content, is one of the most effective ways to counter the spread of malicious content. It has applications in network security, public health policy, and social media surveillance. Finding a subset of nodes whose immunization results in the least vulnerability of the network is a computationally challenging task. In this work, we establish a relationship between a widely used network vulnerability measure and the combinatorial properties of networks. Using this relationship and graph summarization techniques, we propose an efficient approximation algorithm to find a set of nodes to immunize. We provide theoretical justification for the proposed solution and analytical bounds on the runtime of our algorithm. We empirically demonstrate on various real-world networks that the performance of our algorithm is an order of magnitude better than that of the state-of-the-art solution. We also show that in practice the runtime of our algorithm is significantly lower than that of the best-known solution.
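One classical link between the spectral vulnerability measure and combinatorial structure, sketched below, is that the number of closed walks of length k in a graph equals trace(A^k) = sum_i lambda_i^k, so closed-walk counts pin down the largest eigenvalue. This well-known identity is offered only as an illustration of the kind of relationship meant; it is not claimed to be the specific measure or bound developed in the paper.

```python
# Illustration (not the paper's specific bound): closed-walk counts
# determine the spectral radius, since trace(A^k) = sum_i lambda_i^k.
import numpy as np

def closed_walks(A, k):
    """Number of closed walks of length k, i.e. trace(A^k)."""
    return np.trace(np.linalg.matrix_power(A, k))

def walk_estimate(A, k):
    """For even k, trace(A^k)^(1/k) upper-bounds the spectral radius
    and converges to it as k grows."""
    return closed_walks(A, k) ** (1.0 / k)

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
true_radius = np.max(np.linalg.eigvalsh(A))
for k in (4, 8, 16, 32):
    print(k, round(walk_estimate(A, k), 4), "vs", round(true_radius, 4))
```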
We argue that classic citation-based scientific document clustering approaches, such as co-citation and bibliographic coupling, fail to leverage the social usage of the scientific literature that originates on online information dissemination platforms such as Twitter. In this paper, we present tweet coupling, a methodology that measures the similarity between two or more scientific documents based on whether one or more Twitter users mention them in their tweet(s). We evaluate our proposal on an altmetric dataset consisting of 3,081 scientific documents and 8,299 unique Twitter users. By applying both bibliographic coupling and tweet coupling, we examine the relationship between bibliographically coupled and tweet-coupled scientific documents. Further, using VOSviewer, we empirically show that tweet coupling appears to be the better clustering methodology for generating cohesive clusters, since it groups similar documents from the subfields of the selected field, in contrast to bibliographic coupling, which groups cross-disciplinary documents in the same cluster.
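To make the definition concrete, the sketch below computes a tweet-coupling strength for a pair of documents from the sets of Twitter users who mentioned each; the Jaccard normalization and the example user sets are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative tweet-coupling strength: two documents are coupled when
# at least one Twitter user mentions both. The Jaccard normalization
# is an assumption for this sketch, not the paper's exact formula.
def tweet_coupling(users_a, users_b):
    """Coupling strength between two documents, given the sets of
    Twitter users who mentioned each document."""
    shared = users_a & users_b
    if not shared:
        return 0.0
    return len(shared) / len(users_a | users_b)

# Hypothetical user sets for three documents.
doc1 = {"@alice", "@bob", "@carol"}
doc2 = {"@bob", "@carol", "@dave"}
doc3 = {"@erin"}

print(tweet_coupling(doc1, doc2))  # 0.5 -> coupled
print(tweet_coupling(doc1, doc3))  # 0.0 -> not coupled
```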