The projection problem (conceptual graph projection, homomorphism, injective morphism, θ-subsumption, OI-subsumption) is crucial to the efficiency of relational learning systems. Managing its complexity has motivated numerous studies on learning biases that restrict the size and/or the number of hypotheses explored. This paper advocates a projection operator based on the classical arc consistency algorithm used in constraint satisfaction problems. This projection method has the required properties: polynomial complexity, local validation, parallelization, and structural interpretation. Using the arc consistency projection, we build a generalization operator between labeled graphs. This operator gives the structure of the classification space, which is a concept lattice.
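To make the idea of an arc consistency projection concrete, the following is a minimal sketch, not the authors' algorithm. It assumes directed, vertex-labeled graphs given as a label dictionary plus a set of edges, and it uses a simple fixed-point (AC-1 style) filtering loop. The function name `ac_projection` and the data layout are hypothetical and chosen only for illustration.

```python
def ac_projection(labels1, edges1, labels2, edges2):
    """Arc-consistency filtering of graph 1 into graph 2 (illustrative sketch).

    labels*: dict mapping vertex -> label; edges*: set of directed (u, v) pairs.
    Returns the filtered domains (vertex of graph 1 -> candidate images in
    graph 2), or None if some domain becomes empty.
    """
    # Initial domains: target vertices carrying the same label.
    domains = {
        v1: {v2 for v2, lab in labels2.items() if lab == labels1[v1]}
        for v1 in labels1
    }
    # Successor and predecessor sets of the target graph.
    succ2 = {v: set() for v in labels2}
    pred2 = {v: set() for v in labels2}
    for u, v in edges2:
        succ2[u].add(v)
        pred2[v].add(u)

    changed = True
    while changed:  # repeat until no domain shrinks (fixed point)
        changed = False
        for u1, v1 in edges1:
            # Images of u1 need a successor among the images of v1 ...
            keep_u = {a for a in domains[u1] if succ2[a] & domains[v1]}
            # ... and images of v1 need a predecessor among those of u1.
            keep_v = {b for b in domains[v1] if pred2[b] & keep_u}
            if keep_u != domains[u1] or keep_v != domains[v1]:
                domains[u1], domains[v1] = keep_u, keep_v
                changed = True

    if any(not d for d in domains.values()):
        return None
    return domains


# A directed 3-cycle tested against a directed 2-cycle (all labels equal).
c3_labels = {"a": "n", "b": "n", "c": "n"}
c3_edges = {("a", "b"), ("b", "c"), ("c", "a")}
c2_labels = {"x": "n", "y": "n"}
c2_edges = {("x", "y"), ("y", "x")}
result = ac_projection(c3_labels, c3_edges, c2_labels, c2_edges)
print(result is not None)  # True
```

Each pass only inspects one edge and two domains at a time, which is where the local validation and polynomial cost come from. The usage example also shows the price of the relaxation: there is no homomorphism from a directed 3-cycle to a directed 2-cycle, yet the arc-consistent domains stay non-empty, so this weaker test accepts the pair.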
The paper presents a new projection operator for graphs, named AC-projection, which exhibits good complexity properties, in contrast to the graph isomorphism (θ-subsumption) operator typically used in graph mining. We study the size of the search space and some practical properties of the projection operator. These properties yield a specialization algorithm that uses simple local operations. We then show experimentally that a significant performance gain (polynomial-complexity projection) can be achieved with little or no loss in the quality of the discovered patterns.
Developing algorithms that discover all frequently occurring subgraphs in a large graph database is computationally expensive, as graph and subgraph isomorphisms play a key role throughout the computations. Since subgraph isomorphism testing is a hard problem, fragment miners are exponential in runtime. To alleviate this complexity issue, we propose to introduce a bias in the projection operator: instead of the costly subgraph isomorphism projection, one can use a polynomial projection that has a semantically valid structural interpretation. In this paper, we present LC-mine, a generic and efficient framework for mining frequent subgraphs by means of local consistency techniques from the constraint programming field. Two instances of the framework, both based on the arc consistency technique, are developed and presented. The first follows a breadth-first order, while the second is a pattern-growth approach that explores the search space depth-first. We then show experimentally that a significant performance gain can be achieved with little or no loss in the quality of the discovered patterns.