Expressive Description Logics (DLs) have been advocated as formalisms for modeling the domain of interest in various application areas. An important requirement is the ability to answer complex queries beyond instance retrieval, taking into account constraints expressed in a knowledge base. We consider this task for positive existential path queries (which generalize conjunctive queries and unions thereof), whose atoms are regular expressions over the roles (and concepts) of a knowledge base in the expressive DL ALCQIbreg. Using techniques based on two-way tree-automata, we first provide an elegant characterization of TBox and ABox reasoning, which gives us also a tight EXPTIME bound. We then prove decidability (more precisely, a 2EXPTIME upper bound) of query answering, thus significantly pushing the decidability frontier, both with respect to the query language and the considered DL. We also show that query answering is EXPSPACE-hard already in rather restricted settings.
The logical foundations of the standard web ontology languages are provided by expressive Description Logics (DLs), such as SHIQ and SHOIQ. In the Semantic Web and other domains, ontologies are increasingly seen also as a mechanism to access and query data repositories. This novel context poses an original combination of challenges that has not been addressed before: (i) sufficient expressive power of the DL to capture common data modelling constructs; (ii) well established and flexible query mechanisms such as those inspired by database technology; (iii) optimisation of inference techniques with respect to data size, which typically dominates the size of ontologies. This calls for investigating data complexity of query answering in expressive DLs. While the complexity of DLs has been studied extensively, few tight characterisations of data complexity were available, and the problem was still open for most DLs of the SH family and for standard query languages like conjunctive queries and their extensions. We tackle this issue and prove a tight coNP upper bound for positive existential queries without transitive roles in SHOQ, SHIQ, and SHOI. We thus establish that, for a whole range of sublogics of SHOIQ that contain AL, answering such queries has coNP-complete data complexity. We obtain our result by a novel tableaux-based algorithm for checking query entailment, which uses a modified blocking condition in the style of Carin. The algorithm is sound for SHOIQ, and shown to be complete for all considered proper sublogics in the SH family.
Conjunctive regular path queries are an expressive extension of the well-known class of conjunctive queries. Such queries have been extensively studied in the (graph) database community, since they support a controlled form of recursion and enable sophisticated path navigation. Somewhat surprisingly, there has been little work aimed at using such queries in the context of description logic (DL) knowledge bases, particularly for the lightweight DLs that are considered best suited for data-intensive applications. This paper aims to bridge this gap by providing algorithms and tight complexity bounds for answering two-way conjunctive regular path queries over DL knowledge bases formulated in lightweight DLs of the DL-Lite and EL families. Our results demonstrate that in data complexity, the cost of moving to this richer query language is as low as one could wish for: the problem is NL-complete for DL-Lite and P-complete for EL. The combined complexity of query answering increases from NP- to PSpace-complete, but for two-way regular path queries (without conjunction), we show that query answering is tractable even with respect to combined complexity. Our results reveal two-way conjunctive regular path queries as a promising language for querying data enriched by ontologies formulated in DLs of the DL-Lite and EL families or the corresponding OWL 2 QL and EL profiles.
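To make the notion of a two-way regular path query concrete, the following is a minimal sketch that evaluates one over a plain edge-labelled graph by searching the product of the graph with a finite automaton. It is an illustration only, not the algorithm of the paper above: it ignores the ontology entirely (no DL-Lite or EL axioms are taken into account), and every name in it (`eval_2rpq`, the `role-` suffix for inverse traversal, the sample data) is a hypothetical choice made for this example.

```python
from collections import deque


def eval_2rpq(edges, transitions, start_state, final_states, source):
    """Return all nodes reachable from `source` along a path whose label
    sequence is accepted by the given NFA.

    edges:        iterable of (subject, role, object) triples
    transitions:  dict mapping (state, symbol) -> set of successor states;
                  a symbol is a role name, or role + "-" for the inverse
    NOTE: plain graph evaluation only; no ontology reasoning is performed.
    """
    # Index each edge in both directions so roles can be traversed backwards.
    adjacency = {}
    for subj, role, obj in edges:
        adjacency.setdefault(subj, []).append((role, obj))
        adjacency.setdefault(obj, []).append((role + "-", subj))

    # Breadth-first search over (graph node, automaton state) pairs.
    seen = {(source, start_state)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in final_states:
            answers.add(node)
        for symbol, neighbour in adjacency.get(node, []):
            for next_state in transitions.get((state, symbol), ()):
                pair = (neighbour, next_state)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return answers


# Hypothetical sample data.
EDGES = [
    ("ann", "hasParent", "bob"),
    ("dora", "hasParent", "bob"),
    ("bob", "hasParent", "carl"),
]

# hasParent . hasParent-  (go up to a parent, then down to any child)
SAME_PARENT = {(0, "hasParent"): {1}, (1, "hasParent-"): {2}}

# hasParent+  (one or more parent steps)
ANCESTOR = {(0, "hasParent"): {1}, (1, "hasParent"): {1}}

print(sorted(eval_2rpq(EDGES, SAME_PARENT, 0, {2}, "ann")))  # ['ann', 'dora']
print(sorted(eval_2rpq(EDGES, ANCESTOR, 0, {1}, "ann")))     # ['bob', 'carl']
```

The search visits each (node, state) pair at most once, so the work is bounded by the size of the graph times the size of the automaton. Answering such queries over a knowledge base additionally requires accounting for facts implied by the ontology, which this sketch does not attempt.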
Research directions for Principles of Data Management
PDM played a foundational role in the relational database model, with the robust connection between algebraic and calculus-based query languages, the connection between integrity constraints and database design, key insights for the field of query optimization, and the fundamentals of consistent concurrent transactions. This early work included rich cross-fertilization between PDM and other disciplines in mathematics and computer science, including logic, complexity theory, and knowledge representation. Since the 1990s we have seen an overwhelming increase in both the production of data and the ability to store and access such data. This has led to a phenomenal metamorphosis in the ways that we manage and use data. During this time, we have gone (1) from stand-alone disk-based databases to data that is spread across and linked by the Web, (2) from rigidly structured towards loosely structured data, and (3) from relational data to many different data models (hierarchical, graph-structured, data points, NoSQL, text data, image data, etc.). Research on PDM has developed during that time, too, following, accompanying and influencing this process. It has intensified research on extensions of the relational model (data exchange, incomplete data, probabilistic data, ...), on other data models (hierarchical, semi-structured, graph, text, ...), and on a variety of further data management areas, including knowledge representation and the semantic web, data privacy and security, and data-aware (business) processes.
Along the way, the PDM community expanded its cross-fertilization with related areas, to include automata theory, web services, parallel computation, document processing, data structures, scientific workflow, business process management, data-centered dynamic systems, data mining, machine learning, information extraction, etc. Looking forward, three broad areas of data management stand out where principled, mathematical thinking can bring new approaches and much-needed clarity. The first relates to the full lifecycle of so-called "Big Data Analytics", that is, the application of statistical and machine learning techniques to make sense out of, and derive value from, massive volumes of data. The second stems from new forms of data creation and processing, especially as it arises in applications such as web-based commerce, social media applications, and data-aware workflow and business process management. The third, which is just beginning to emerge, is the development of new principles and approaches in support of ethical data management. We briefly illustrate some of the primary ways that these three areas can be supported by the seven PDM research themes that are explored in this report. The overall lifecycle of Big Data Analytics raises a wealth of challenge areas that PDM can help with. As documented in numerous sources, so-called "data wrangling" can form 50% to 80% of the labor costs in an analytics investigation. The challenges of data wrangling can be ...