Linked Open Data has recently grown into a large collection of knowledge bases. Consequently, querying Linked Data with question answering (QA) techniques has attracted the attention of many researchers. A QA system translates natural language questions into structured queries, such as SPARQL queries, to be executed over Linked Data. The two main challenges in such systems are the lexical and semantic gaps. The lexical gap is the difference between the vocabulary used in an input question and the vocabulary used in the knowledge base. The semantic gap is the difference between the expressed information need and the way knowledge is represented in the knowledge base. In this paper, we present a novel method that uses an ontology lexicon and dependency parse trees to overcome both gaps. The proposed technique is evaluated on the QALD‐5 benchmark and exhibits promising results.
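The following is a minimal, illustrative sketch of the general idea of mapping a dependency-parsed question onto a SPARQL triple pattern; it is not the authors' system, and the toy lexicon entries, DBpedia-style URIs, and the spaCy model are assumptions introduced here for illustration.

```python
# Sketch only: parse a question with spaCy, map the root verb and its object through a
# toy "ontology lexicon" (hypothetical entries), and emit a SPARQL query.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

# Toy ontology lexicon bridging the lexical gap: surface words -> KB vocabulary.
LEXICON = {
    "wrote": "dbo:author",   # hypothetical mapping
    "Hamlet": "dbr:Hamlet",  # hypothetical mapping
}

def question_to_sparql(question: str) -> str:
    doc = nlp(question)
    root = next(tok for tok in doc if tok.dep_ == "ROOT")              # main verb, e.g. "wrote"
    obj = next(tok for tok in root.children if tok.dep_ in ("dobj", "obj"))
    predicate = LEXICON.get(root.text, root.lemma_)
    entity = LEXICON.get(obj.text, obj.text)
    return f"SELECT ?x WHERE {{ {entity} {predicate} ?x . }}"

print(question_to_sparql("Who wrote Hamlet?"))
# Expected output: SELECT ?x WHERE { dbr:Hamlet dbo:author ?x . }
```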
This paper proposes an efficient method for identifying the topics of web documents, a long-standing challenge in many information retrieval systems. Most previous work analyzes the entire text with time-consuming methods, and much of it relies on unsupervised approaches to identify a document's main topic. In contrast, this paper exploits the most widely used Hyper-Text Markup Language (HTML) features to extract topics from web documents with a supervised approach. Using an interactive crawler, we first analyze the HTML structure of 5000 webpages to identify the most widely used HTML features. Next, the selected features of 1500 webpages are extracted with the same crawler, and suitable topics are assigned to each web document by users in a supervised learning process. A topic modeling technique is applied to the extracted features to build four classifiers (C4.5, Decision Tree, Naïve Bayes, and Maximum Entropy), each trained and tested separately on our data. The classifiers are compared and the most accurate one is selected. To examine the approach at a larger scale, a new set of 3500 web documents is evaluated with the selected classifier. The results show that the proposed system performs well, achieving a 71.8% recognition rate.
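A rough sketch of this kind of pipeline is given below; it is not the paper's implementation. It extracts a few commonly used HTML features (title, headings, meta keywords) and compares a Decision Tree and a Naïve Bayes classifier; the example pages, topic labels, and feature choices are hypothetical.

```python
# Sketch: HTML feature extraction + supervised topic classification with scikit-learn.
from bs4 import BeautifulSoup
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def html_features(html: str) -> str:
    """Concatenate the HTML fields used as features into one text string."""
    soup = BeautifulSoup(html, "html.parser")
    parts = [soup.title.get_text() if soup.title else ""]
    parts += [h.get_text() for h in soup.find_all(["h1", "h2", "h3"])]
    meta = soup.find("meta", attrs={"name": "keywords"})
    if meta and meta.get("content"):
        parts.append(meta["content"])
    return " ".join(parts)

# Hypothetical corpus: raw HTML strings with user-assigned topic labels.
pages = ["<html><title>Python tutorial</title><h1>Loops</h1></html>",
         "<html><title>Football scores</title><h1>League results</h1></html>"] * 50
topics = ["programming", "sports"] * 50

X = TfidfVectorizer().fit_transform(html_features(p) for p in pages)
X_train, X_test, y_train, y_test = train_test_split(X, topics, test_size=0.3, random_state=0)

for clf in (DecisionTreeClassifier(random_state=0), MultinomialNB()):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, accuracy_score(y_test, clf.predict(X_test)))
```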
Because of modeling errors made while designing ontologies, an ontology may carry incorrect information. Ontology debugging helps detect such errors in ontologies whose size and expressiveness keep growing. While current ontology debugging methods can detect logical errors (incoherences and inconsistencies), they cannot detect hidden modeling errors in coherent and consistent ontologies. From a logical perspective such ontologies contain no errors, but this study shows that some modeling errors do not break coherence simply because they never participate in a contradiction. In this paper, contextual knowledge is exploited to detect these hidden errors. Our experiments show that adding general ontologies such as DBpedia as contextual knowledge during ontology debugging exposes contradictions in ontologies that are otherwise coherent.
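A hedged sketch of the underlying idea follows; it is not the paper's tool. It merges a placeholder contextual ontology into the ontology under debugging and reruns a reasoner to see whether contradictions surface. The file names are placeholders, and the example assumes owlready2 with its bundled HermiT reasoner (which requires Java).

```python
# Sketch: does adding contextual knowledge expose hidden contradictions?
from owlready2 import (get_ontology, sync_reasoner, default_world,
                       OwlReadyInconsistentOntologyError)

onto = get_ontology("file://debugged_ontology.owl").load()        # ontology under debugging (placeholder)
context = get_ontology("file://contextual_knowledge.owl").load()  # e.g. a DBpedia fragment (placeholder)
onto.imported_ontologies.append(context)                          # merge the contextual axioms

try:
    with onto:
        sync_reasoner()                                            # reason over the merged ontology
except OwlReadyInconsistentOntologyError:
    print("Merged ontology is inconsistent: a hidden modeling error surfaced.")
else:
    unsat = list(default_world.inconsistent_classes())             # unsatisfiable (incoherent) classes
    if unsat:
        print("Classes made incoherent by the contextual knowledge:", unsat)
    else:
        print("No contradictions found with this context.")
```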
Most ontology alignment tools use terminological techniques as the initial step and then apply structural techniques to refine the results. Since each terminological similarity measure captures only some aspects of similarity, ontology alignment systems need to exploit several measures. While a great deal of effort has been devoted to developing terminological similarity measures and ontology alignment systems, little attention has been paid to similarity search algorithms that combine different measures so as to gain their benefits while avoiding their limitations. We propose a novel terminological search algorithm that finds an entity in a given ontology similar to an input search string. The algorithm expands the search string by building a matrix from its synonyms and hypernyms, and it employs and combines different kinds of similarity measures in different situations to achieve higher performance, accuracy, and stability than previous methods, which either use a single measure or combine several measures in naive ways such as averaging. We evaluated the algorithm on a subset of the OAEI Benchmark dataset. The results show the superiority of the proposed algorithm and the effectiveness of the applied techniques, such as word sense disambiguation and the semantic filtering mechanism.
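Below is an illustrative sketch of a terminological search that expands the query with WordNet synonyms and hypernyms and scores ontology labels with a combination of two string similarity measures; it is not the proposed algorithm, and the weighting scheme and example labels are assumptions made for illustration.

```python
# Sketch: synonym/hypernym expansion + combined string similarity search.
# Assumes: pip install nltk && python -c "import nltk; nltk.download('wordnet')"
from difflib import SequenceMatcher
from nltk.corpus import wordnet as wn

def expansions(term: str) -> list[str]:
    """The query term plus its WordNet synonyms and direct hypernyms."""
    words = {term.lower()}
    for syn in wn.synsets(term):
        words.update(l.name().replace("_", " ") for l in syn.lemmas())
        for hyper in syn.hypernyms():
            words.update(l.name().replace("_", " ") for l in hyper.lemmas())
    return sorted(words)

def edit_sim(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

def jaccard_sim(a: str, b: str) -> float:
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def search(query: str, ontology_labels: list[str]) -> list[tuple[str, float]]:
    """Rank ontology labels by the best combined similarity over the expanded query."""
    scored = []
    for label in ontology_labels:
        best = max(0.6 * edit_sim(q, label.lower()) + 0.4 * jaccard_sim(q, label.lower())
                   for q in expansions(query))   # weights are arbitrary, for illustration only
        scored.append((label, round(best, 3)))
    return sorted(scored, key=lambda x: x[1], reverse=True)

print(search("car", ["automobile", "motor vehicle", "person"]))
```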