Keyword search over relational streams is useful when allowing users to query on streams without understanding the details about the streams and query language as well. There have been several research works on this direction, and the state-of-the-art approaches exploit Candidate Networks (CNs), which are schema-level descriptions of possible joining networks of tuples, and generate query plans based on CNs. However, in fact, the performance of these approaches seriously degrades in particular when the maximum size of CNs (T max ) and/or the number of query keywords are large due to the explosive increase in the number of CNs. To cope with this problem, we propose a novel query plan called MX-structure to consolidate CNs as much as possible. We suppress explosive blowup of nodes in query plans by consolidating all common edges among CNs. The experimental results prove that the proposed algorithm performs much better than the state-of-the-art approaches.
Stream processing is used in various fields. In the field of big data, stream aggregation is a popular processing technique, but it suffers serious setbacks when the order of events (e.g., stream elements) occurring is different from the order of events arriving to the systems. Such data streams are called "non-FIFO steams". This phenomenon usually occurs in a distributed environment due to many factors, such as network disruptions, delays, etc. Many analyzing scenarios require efficient processing of such non-FIFO streams to meet various data processing requirements. This paper proposes an efficient scalable checkpoint-based bidirectional indexing approach, called CP iX, for faster real-time analysis over non-FIFO streams. CPiX maintains the partial aggregation results in an on-demand manner per checkpoint. CPiX needs less time and space than the state-of-the-art approach. Extensive experiments confirm that CPiX can deal with out-of-order streams very efficiently and is, on average, about 3.8 times faster than the state-of-the-art approach while consuming less memory.
Purpose – In purpose of this paper is to propose a novel scheme to process XPath-based keyword search over Extensible Markup Language (XML) streams, where one can specify query keywords and XPath-based filtering conditions at the same time. Experimental results prove that our proposed scheme can efficiently and practically process XPath-based keyword search over XML streams. Design/methodology/approach – To allow XPath-based keyword search over XML streams, it was attempted to integrate YFilter (Diao et al., 2003) with CKStream (Hummel et al., 2011). More precisely, the nondeterministic finite automation (NFA) of YFilter is extended so that keyword matching at text nodes is supported. Next, the stack data structure is modified by integrating set of NFA states in YFilter with bitmaps generated from set of keyword queries in CKStream. Findings – Extensive experiments were conducted using both synthetic and real data set to show the effectiveness of the proposed method. The experimental results showed that the accuracy of the proposed method was better than the baseline method (CKStream), while it consumed less memory. Moreover, the proposed scheme showed good scalability with respect to the number of queries. Originality/value – Due to the rapid diffusion of XML streams, the demand for querying such information is also growing. In such a situation, the ability to query by combining XPath and keyword search is important, because it is easy to use, but powerful means to query XML streams. However, none of existing works has addressed this issue. This work is to cope with this problem by combining an existing XPath-based YFilter and a keyword-search-based CKStream for XML streams to enable XPath-based keyword search.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.