Given a knowledge graph and a fact (a triple statement), fact checking decides whether the fact belongs to the missing part of the graph. Facts in real-world knowledge bases are typically interpreted through both topological and semantic context, which existing methods do not fully exploit. This paper introduces a novel fact checking method that explicitly exploits discriminant subgraph structures. Our method discovers discriminant subgraphs associated with a set of training facts, characterized by a class of graph fact checking rules. These rules incorporate expressive subgraph patterns to jointly describe both topological and ontological constraints. (1) We extend graph fact checking rules (GFCs) to a class of ontological graph fact checking rules. The ontological rules generalize GFCs by incorporating both topological constraints and ontological closeness to best distinguish between true and false fact statements. We provide quality measures to characterize useful patterns that are both discriminant and diversified. (2) Despite the increased expressiveness, we show that it is feasible to discover the ontological rules in large graphs with ontologies, by developing a supervised pattern discovery algorithm. To find useful rules as early as possible, the algorithm generates subgraph patterns relevant to training facts and dynamically selects patterns from a pattern stream with a small update cost per pattern. We verify that the ontological rules can be used directly as rules and also provide useful features for other statistical learning-based fact checking models. Using real-world knowledge bases, we experimentally verify the efficiency and the effectiveness of techniques based on these rules for fact checking.
This article presents a new framework that incorporates graph patterns to support fact checking in knowledge graphs. Our method discovers discriminant graph patterns to construct classifiers for fact prediction. First, we propose a class of graph fact checking rules (GFCs). A GFC incorporates graph patterns that best distinguish true and false facts of generalized fact statements. We provide statistical measures to characterize useful patterns that are both discriminant and diversified. Second, we show that it is feasible to discover GFCs in large graphs with optimality guarantees. We develop an algorithm that performs localized search to generate a stream of graph patterns and dynamically assembles the best GFCs from multiple GFC sets, where each set ensures quality scores within a certain range. The algorithm guarantees a $(\frac{1}{2} - \epsilon)$-approximation when it terminates (possibly early). We also develop a space-efficient alternative that dynamically spawns prioritized patterns with the best marginal gains relative to the verified GFCs. It guarantees a $(1 - \frac{1}{e})$-approximation. Both strategies guarantee a bounded time cost independent of the size of the underlying graph. Third, to support fact checking, we develop two classifiers, which use top-ranked GFCs as predictive rules or as instance-level features of the pattern matches induced by GFCs, respectively. Using real-world data, we experimentally verify the efficiency and the effectiveness of GFC-based techniques for fact checking in knowledge graphs and verify their application in knowledge exploration and news prediction.
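The idea of selecting patterns by best marginal gain can be illustrated with a minimal sketch. This is not the authors' discovery algorithm: it assumes a hypothetical coverage objective (the number of training facts matched by at least one selected pattern) in place of the paper's quality measure, and the names `greedy_select` and `pattern_cover` are invented for illustration. For a monotone submodular objective such as coverage under a cardinality limit, this classic greedy strategy is known to achieve a $(1 - \frac{1}{e})$-approximation.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_select(pattern_cover: Dict[str, FrozenSet[int]], k: int) -> List[str]:
    """Pick at most k patterns, each time taking the largest marginal gain.

    pattern_cover maps a (hypothetical) pattern name to the ids of the
    training facts it matches. Coverage is an assumed stand-in objective,
    not the quality measure defined in the paper.
    """
    selected: List[str] = []
    covered: Set[int] = set()
    for _ in range(k):
        best_name = None
        best_gain = 0
        # Iterate in sorted order so ties are broken deterministically.
        for name in sorted(pattern_cover):
            if name in selected:
                continue
            # Marginal gain: facts this pattern adds beyond those covered.
            gain = len(pattern_cover[name] - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no remaining pattern improves coverage
        selected.append(best_name)
        covered |= pattern_cover[best_name]
    return selected


# Toy input: four hypothetical patterns over six training facts.
patterns = {
    "P1": frozenset({1, 2, 3, 4}),
    "P2": frozenset({3, 4, 5}),
    "P3": frozenset({5, 6}),
    "P4": frozenset({1, 2}),
}
chosen = greedy_select(patterns, 2)
print(chosen)
```

In this toy input, P2 matches more facts than P3 on its own, but once P1 is chosen P3 contributes more new facts, so marginal gain and raw match count rank the patterns differently.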