The problem of integrating heterogeneous data sources into an ontology is highly relevant in the database field. Several techniques exist to approach the problem, but side constraints on the data cannot be easily implemented, and thus the results may be inconsistent. In this paper we improve on previous work by Taheriyan et al. [2016a] using Machine Learning (ML) to take into account inconsistencies in the data (unmatchable attributes), and we encode the problem as a variation of the Steiner Tree Problem, for which we use work by De Uña et al. [2016] in Constraint Programming (CP). Combining ML and CP achieves state-of-the-art precision, recall and speed, and provides a more flexible framework for variations of the problem.
The Steiner Tree Problem is a well-known, well-studied NP-complete problem for which fast algorithms are already available. Nonetheless, in the real world the Steiner Tree Problem is almost always accompanied by side constraints, which means these algorithms cannot be applied directly. For many problems with side constraints, only approximation algorithms are known. We introduce here a propagator for the tree constraint with explanations, as well as lower bounding techniques and a novel constraint programming approach for the Steiner Tree Problem and two of its variants. We find that our propagators with explanations are highly advantageous when solving variants of this problem.
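To make the role of side constraints concrete, the following is a minimal brute-force sketch, not the constraint programming propagator described above. It enumerates edge subsets of a tiny graph, keeps those that form a tree touching every terminal, and optionally filters them with a side constraint. The function names, the example graph, and the degree bound used as a side constraint are all assumptions chosen for illustration; exhaustive enumeration is only feasible for very small instances.

```python
from itertools import combinations


def is_tree_covering(edge_subset, terminals):
    """Return True if the edges form one tree that touches every terminal."""
    nodes = {v for edge in edge_subset for v in edge}
    if not terminals <= nodes:
        return False
    # A connected graph with |V| - 1 edges is a tree.
    if len(edge_subset) != len(nodes) - 1:
        return False
    adjacency = {v: [] for v in nodes}
    for u, v in edge_subset:
        adjacency[u].append(v)
        adjacency[v].append(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def steiner_tree(weights, terminals, constraint=lambda tree: True):
    """Exhaustively search for the cheapest tree covering `terminals`.

    `weights` maps undirected edges (u, v) to costs. `constraint` is a
    hypothetical hook for a side constraint: a predicate on the edge tuple.
    Returns (cost, edges), or None if no feasible tree exists.
    """
    best = None
    edges = list(weights)
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            if is_tree_covering(subset, terminals) and constraint(subset):
                cost = sum(weights[edge] for edge in subset)
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


def max_degree_at_most(limit):
    """Illustrative side constraint: no node may have more than `limit` tree edges."""
    def check(tree):
        degree = {}
        for u, v in tree:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        return max(degree.values()) <= limit
    return check


# Example instance (assumed): terminals a, b, c and one optional Steiner node s.
WEIGHTS = {
    ("a", "s"): 1, ("b", "s"): 1, ("c", "s"): 1,
    ("a", "b"): 2, ("b", "c"): 2, ("a", "c"): 2,
}
TERMINALS = {"a", "b", "c"}

unconstrained = steiner_tree(WEIGHTS, TERMINALS)
constrained = steiner_tree(WEIGHTS, TERMINALS, max_degree_at_most(2))
```

In this instance the unconstrained optimum is the star through `s`, while the degree bound rules that tree out and forces a more expensive path. This is the kind of situation in which an algorithm specialized for the plain Steiner Tree Problem no longer applies as is.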