Attributed directed graphs are directed graphs in which each node is associated with a set of attributes. Much real-world data can be represented naturally by this type of structure, but few algorithms can handle these complex graphs directly. Mining attributed graphs is difficult because it requires combining the exploration of the graph structure with the identification of frequent itemsets. In addition, because of the combinatorics on itemsets, subgraph isomorphisms (which have a significant impact on performance) are much more numerous than in labeled graphs. In this paper, we present a new data mining method that can extract frequent patterns from one or more directed attributed graphs. We show how to reduce the combinatorial explosion induced by subgraph isomorphisms through appropriate processing of automorphic patterns.
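To make the combinatorics on itemsets concrete, here is a minimal Python sketch. It is not the method presented in the paper: the toy graph, the function names, and the edge-based support measure are all assumptions chosen for illustration. It enumerates only single-edge patterns of the form (source itemset, target itemset) and keeps those that reach a minimum support.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy graph: each node carries a set of attributes.
attributes = {
    1: {"a", "b"},
    2: {"b", "c"},
    3: {"a", "b"},
    4: {"b", "c"},
}
edges = [(1, 2), (3, 4), (2, 3)]  # directed edges (source, target)


def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    ordered = sorted(items)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def frequent_edge_patterns(attributes, edges, min_support):
    """Count single-edge patterns (source itemset -> target itemset).

    Support is the number of edges whose endpoints contain the two
    itemsets. This measure is an assumption made for the sketch.
    """
    counts = Counter()
    for source, target in edges:
        for src_items in nonempty_subsets(attributes[source]):
            for dst_items in nonempty_subsets(attributes[target]):
                counts[(src_items, dst_items)] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


patterns = frequent_edge_patterns(attributes, edges, min_support=2)
```

Each edge between two nodes with two attributes already yields nine candidate patterns, which hints at why occurrences are far more numerous here than in a graph with one label per node.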
Directed Acyclic Graphs (DAGs) are used in many domains ranging from computer science to bioinformatics, including industry and geoscience. They make it possible to model complex evolutions in which spatial objects (e.g., soil erosion) may move, appear, disappear, merge or split. We study a new graph-based representation, called attributed DAG (a-DAG), which captures interactions between objects as well as information on the objects themselves (e.g., characteristics or events). In this paper, we focus on pattern mining in such data. Our patterns, called weighted paths, offer a good trade-off between expressiveness and complexity. Frequency and compactness constraints are used to filter out uninteresting patterns. In the single-graph setting, these constraints lead to an exact condensed representation (without loss of information). We propose a depth-first search strategy and an optimized data structure to discover weighted paths efficiently; patterns are extended progressively on the basis of database projections. Relevance, scalability and genericity are illustrated by qualitative and quantitative results obtained on various real and synthetic datasets. In particular, we show how such an approach can be used to monitor soil erosion using remote sensing and Geographical Information System (GIS) data.
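The depth-first, projection-based extension described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: patterns here are plain sequences of single attributes along a path (no weights, no compactness constraint), support is the number of distinct start nodes, and the toy a-DAG with its attribute names is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy a-DAG: node -> attributes, node -> successors.
attrs = {
    "n1": {"grow"},
    "n2": {"grow", "merge"},
    "n3": {"split"},
    "n4": {"grow"},
    "n5": {"split"},
}
succ = {"n1": ["n2"], "n2": ["n3"], "n3": [], "n4": ["n5"], "n5": []}


def mine_paths(attrs, succ, min_support):
    """Depth-first mining of attribute sequences along DAG paths."""
    results = {}

    def extend(pattern, projection):
        # projection: set of (start, end) node pairs where `pattern` occurs.
        support = len({start for start, _ in projection})
        if support < min_support:
            return  # anti-monotone pruning: no extension can be frequent
        results[pattern] = support
        # Grow the pattern by one node, reusing the projection instead of
        # rescanning the whole graph.
        grown = defaultdict(set)
        for start, end in projection:
            for nxt in succ.get(end, []):
                for attr in attrs[nxt]:
                    grown[pattern + (attr,)].add((start, nxt))
        for new_pattern in sorted(grown):
            extend(new_pattern, grown[new_pattern])

    # Seed the search with every single-attribute pattern.
    seeds = defaultdict(set)
    for node, node_attrs in attrs.items():
        for attr in node_attrs:
            seeds[(attr,)].add((node, node))
    for pattern in sorted(seeds):
        extend(pattern, seeds[pattern])
    return results


frequent = mine_paths(attrs, succ, min_support=2)
```

Counting distinct start nodes keeps the support measure anti-monotone when a pattern is extended at its end, which is what justifies abandoning a branch as soon as it falls below the threshold.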