We introduce a new framework to evaluate and improve first-order (FO) ontologies using automated theorem provers (ATPs) on the basis of competency questions (CQs). Our framework includes both the adaptation of a methodology for evaluating ontologies to the framework of first-order logic and a new set of non-trivial CQs designed to evaluate FO versions of SUMO, which significantly extends the very small set of CQs proposed in the literature. Most of these new CQs have been automatically generated from a small set of patterns and the mapping of WordNet to SUMO. Applying our framework, we demonstrate that Adimen-SUMO v2.2 outperforms TPTP-SUMO. In addition, using the feedback provided by ATPs, we have produced an improved version of Adimen-SUMO (v2.4). This new version outperforms the previous ones in terms of competency. For instance, "Humans can reason" is automatically inferred from Adimen-SUMO v2.4, while it is deducible from neither TPTP-SUMO nor Adimen-SUMO v2.2. Comment: 8 pages, 2 tables
Formal ontologies are axiomatizations in a logic-based formalism. The development of formal ontologies is generating considerable research on the use of automated reasoning techniques and tools that help in ontology engineering. One of the main aims is to refine and improve axiomatizations so that automated reasoning tools can efficiently infer reliable information. Defects in the axiomatization can not only cause wrong inferences but also hinder the inference of expected information, either by increasing its computational cost or by preventing it altogether. In this paper, we introduce a novel, fully automatic white-box testing framework for first-order logic (FOL) ontologies. Our methodology is based on the detection of inference-based redundancies in the given axiomatization. The application of the proposed testing method is fully automatic since (i) the automated generation of tests is guided only by the syntax of axioms and (ii) the evaluation of tests is performed by automated theorem provers (ATPs). Our proposal enables the detection of defects and serves to certify the grade of suitability (for reasoning purposes) of every axiom. We formally define the set of tests that are automatically generated from any axiom and prove that every test is logically related to redundancies in the axiom from which the test has been generated. We have implemented our method and used this implementation to automatically detect several non-trivial defects that were hidden in various FOL ontologies. Throughout the paper we provide illustrative examples of these defects and explain how they were found and how each proof given by an ATP provides useful hints on the nature of each defect. Additionally, by correcting all the detected defects, we have obtained an improved version of one of the tested ontologies: Adimen-SUMO.
We report on the results of evaluating the performance of automated theorem provers using Adimen-SUMO. The evaluation follows the adaptation of the methodology based on competency questions \cite{GrF95} to the framework of first-order logic, which is presented in \cite{ALR15}, and is applied to Adimen-SUMO \cite{ALR12}. The set of competency questions used for this evaluation has been semi-automatically generated from a small set of semantic patterns and the mapping of WordNet to SUMO, also introduced in \cite{ALR15}. Our experimental results demonstrate that improved versions of the proposed set of competency questions could be valuable for the development of automated theorem provers.
This paper offers a new practical approach toward automated commonsense reasoning with first-order logic (FOL) SUMO-based ontologies. We propose a new black-box evaluation framework for SUMO-based ontologies, which exploits the world knowledge encoded in WordNet and its mapping into SUMO. Our proposal consists of both a novel semi-automatic method for the creation of a large set of commonsense problems and a new procedure that enables its automatic evaluation by using automated theorem provers (ATPs). The application of our method enables the creation of a very large benchmark consisting of more than 15 000 problems from a small set of manually built question patterns that exploit the WordNet semantic relations. By means of the resulting benchmark, we successfully evaluate the competency of different translations of SUMO into FOL and the performance of various state-of-the-art FOL ATPs according to several quality criteria. A general analysis of our experimental results demonstrates that the proposed commonsense problems are heterogeneous and non-trivial. Furthermore, a fine-grained analysis of the experimental results obtained for a sample of our benchmark enables the detection of some mapping errors and some discrepancies between the knowledge of WordNet and SUMO. The evaluation benchmark and all the resources that have been used and developed during this work are released in a single package. Index terms: ontology evaluation, automated reasoning, commonsense knowledge, WordNet, SUMO.
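As a rough illustration of how a question pattern might be turned into prover-ready problems, the following minimal sketch instantiates one pattern over pairs of ontology classes and emits conjectures in TPTP `fof` syntax. Everything here is an assumption made for illustration: the pattern itself (two classes share no instance), the `instance` predicate name, the function names, and the placeholder class pairs are hypothetical and are not taken from the benchmark described above.

```python
from typing import List, Tuple


def to_tptp_constant(class_name: str) -> str:
    """TPTP constants start with a lower-case letter, so lower-case the first one."""
    return class_name[0].lower() + class_name[1:]


def disjointness_conjecture(name: str, class_a: str, class_b: str) -> str:
    """Hypothetical pattern: nothing is an instance of both classes."""
    a = to_tptp_constant(class_a)
    b = to_tptp_constant(class_b)
    return (
        f"fof({name}, conjecture, "
        f"! [X] : ~ (instance(X, {a}) & instance(X, {b})))."
    )


def generate_problems(pairs: List[Tuple[str, str]]) -> List[str]:
    """Instantiate the pattern once per class pair, numbering problems from 1."""
    return [
        disjointness_conjecture(f"pattern_{i}", a, b)
        for i, (a, b) in enumerate(pairs, start=1)
    ]


# Placeholder pairs; a real run would read them from a WordNet-to-SUMO mapping.
HYPOTHETICAL_PAIRS = [("ClassA", "ClassB"), ("ClassC", "ClassD")]

for problem in generate_problems(HYPOTHETICAL_PAIRS):
    print(problem)
```

Each generated line could then be appended to the ontology axioms and handed to an ATP, with the prover's verdict recorded per problem.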
In this paper, we present a new proposal for an efficient implementation of constructive negation. In our approach the answers for a literal are computed bottom-up by solving equality constraints, instead of by handling frontiers of subsidiary computation trees. The required equality constraints are given by Shepherdson's operators which are, in a sense, similar to bottom-up immediate consequence operators. To make the procedure efficient, two main techniques are applied. First, we restrict our constraints to a class of success-answers (resp. fail-answers) that are easy to manipulate and to solve (or to prove unsatisfiable). Second, we take advantage of the monotonic nature of Shepherdson's operators to make the procedure incremental and to avoid the recalculations that are typical of frontier-based methods. Goal computation then follows the usual top-down CLP scheme of collecting the answers for the selected literal into the constraint of the goal. The procedural mechanism for constructive negation is designed not only to generate every correct answer of a goal, but also to detect failure. That is, in spite of the bottom-up nature of the calculation of literal answers, goal computation is not necessarily infinite. The operational semantics that makes use of these ideas, called BCN, is sound and complete with respect to three-valued program completion for the whole class of normal logic programs. A prototype implementation of this approach has been developed and the experimental results are very promising.