Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data.This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the 'why', of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.
Abstract-Given a graph, how can we automatically discover roles for nodes? Roles could be, eg., 'bridges', or 'peripherynodes', etc. Roles are compact summaries of a node's behavior that generalize across networks. They enable numerous novel and useful network mining tasks, such as sense-making, searching for similar nodes, and node classification. We propose RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting roles from general network data. We demonstrate the effectiveness of RolX on several network mining tasks, from exploratory data analysis to network transfer learning. Moreover, we compare network role discovery with network community discovery. We highlight fundamental differences between the two (e.g., roles generalize across disconnected networks, communities do not).
How much did a network change since yesterday? How different is the wiring between Bob's brain (a left-handed male) and Alice's brain (a right-handed female)? Graph similarity with known node correspondence, i.e. the detection of changes in the connectivity of graphs, arises in numerous settings. In this work, we formally state the axioms and desired properties of the graph similarity functions, and evaluate when state-of-the-art methods fail to detect crucial connectivity changes in graphs. We propose DELTACON, a principled, intuitive, and scalable algorithm that assesses the similarity between two graphs on the same nodes (e.g. employees of a company, customers of a mobile carrier). Experiments on various synthetic and real graphs showcase the advantages of our method over existing similarity measures. Finally, we employ DELTACON to real applications: (a) we classify people to groups of high and low creativity based on their brain connectivity graphs, and (b) do temporal anomaly detection in the who-emails-whom Enron graph.
Anomaly detection is an important problem with multiple applications, and thus has been studied for decades in various research domains. In the past decade there has been a growing interest in anomaly detection in data represented as networks, or graphs, largely because of their robust expressiveness and their natural ability to represent complex relationships. Originally, techniques focused on anomaly detection in static graphs, which do not change and are capable of representing only a single snapshot of data. As real-world networks are constantly changing, there has been a shift in focus to dynamic graphs, which evolve over time.In this survey, we aim to provide a comprehensive overview of anomaly detection in dynamic networks, concentrating on the state-of-the-art methods. We first describe four types of anomalies that arise in dynamic networks, providing an intuitive explanation, applications, and a concrete example for each. Having established an idea for what constitutes an anomaly, a general two-stage approach to anomaly detection in dynamic networks that is common among the methods is presented. We then construct a two-tiered taxonomy, first partitioning the methods based on the intuition behind their approach, and subsequently subdividing them based on the types of anomalies they detect. Within each of the tier one categories-community, compression, decomposition, distance, and probabilistic model based-we highlight the major similarities and differences, showing the wealth of techniques derived from similar conceptual approaches.
While advances in computing resources have made processing enormous amounts of data possible, human ability to identify patterns in such data has not scaled accordingly. Efficient computational methods for condensing and simplifying data are thus becoming vital for extracting actionable insights. In particular, while data summarization techniques have been studied extensively, only recently has summarizing interconnected data, or graphs, become popular. This survey is a structured, comprehensive overview of the state-of-the-art methods for summarizing graph data. We first broach the motivation behind, and the challenges of, graph summarization. We then categorize summarization approaches by the type of graphs taken as input and further organize each category by core methodology. Finally, we discuss applications of summarization on real-world graphs and conclude by describing some open problems in the field.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.