Anisoara Nica scite author profile

One fundamental challenge in data stream processing is to cope with the ubiquity of disorder of tuples within a stream caused by network latency, operator parallelization, merging of asynchronous streams, etc. High result accuracy and low result latency are two conflicting goals in out-of-order stream processing. Different applications may prefer different trade-offs between the two goals. However, existing disorder handling solutions either try to meet one goal to the extreme by sacrificing the other, or try to meet both goals but have shortcomings including unguaranteed result accuracy or increased complexity in operator implementation and application logic. To meet different application requirements on the latency versus result accuracy trade-off in out-of-order stream processing, in this paper, we propose to make this trade-off user-configurable. Particularly, focusing on sliding window aggregates, we introduce AQ-K-slack, a buffer-based quality-driven disorder handling approach. AQ-K-slack leverages techniques from the fields of sampling-based approximate query processing and control theory. It can adjust the input buffer size dynamically to minimize the result latency, while respecting a user-specified threshold on relative errors in produced query results. AQ-K-slack requires no a priori knowledge of disorder characteristics of data streams, and imposes no changes to the query operator implementation or the application logic. Experiments over real-world out-of-order data streams show that, compared to the state of the art, AQ-K-slack can reduce the average buffer size, and thus the average result latency, by at least 51% while respecting user-specified requirements on the accuracy of query results.
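To make the buffering idea concrete, the following is a minimal sketch of a plain K-slack buffer with a fixed slack K. It is not the adaptive AQ-K-slack controller described in the abstract: the class name `KSlackBuffer`, the helper `reorder`, and the tuple layout `(timestamp, value)` are assumptions made for illustration only. A tuple is held back until the newest timestamp seen so far is at least K time units ahead of it, so a larger K repairs more disorder at the cost of higher latency.

```python
import heapq


class KSlackBuffer:
    """Fixed-K slack buffer (illustrative, not the adaptive AQ-K-slack).

    Tuples are released in timestamp order once they are at least K time
    units older than the newest timestamp observed so far.
    """

    def __init__(self, k):
        self.k = k
        self.heap = []                # min-heap of (timestamp, seq, value)
        self.max_ts = float("-inf")   # newest timestamp seen so far
        self.seq = 0                  # tie-breaker so values are never compared

    def insert(self, ts, value):
        """Buffer one tuple and return the tuples that may now be released."""
        self.max_ts = max(self.max_ts, ts)
        heapq.heappush(self.heap, (ts, self.seq, value))
        self.seq += 1
        released = []
        while self.heap and self.heap[0][0] <= self.max_ts - self.k:
            t, _, v = heapq.heappop(self.heap)
            released.append((t, v))
        return released

    def flush(self):
        """Release everything still buffered, in timestamp order."""
        released = []
        while self.heap:
            t, _, v = heapq.heappop(self.heap)
            released.append((t, v))
        return released


def reorder(stream, k):
    """Run a finite stream of (timestamp, value) pairs through the buffer."""
    buf = KSlackBuffer(k)
    out = []
    for ts, value in stream:
        out.extend(buf.insert(ts, value))
    out.extend(buf.flush())
    return out
```

With K = 0 the buffer releases every tuple immediately and leaves the disorder untouched; with a K at least as large as the maximum delay in the stream, the output is fully ordered. An adaptive scheme such as the one the abstract describes would adjust K at runtime between these extremes.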

Quality-Driven Continuous Query Execution over Out-of-Order Data Streams

Ji, Zhou, Jerzak et al. 2015

Executing continuous queries over out-of-order data streams, where tuples are not ordered according to timestamps, is challenging because high result accuracy and low result latency are two conflicting performance metrics. Although many applications allow trading exact query results for lower latency, they still expect the produced results to meet a certain quality requirement. However, none of the existing disorder handling approaches has considered minimizing the result latency while meeting user-specified requirements on the quality of query results. In this demonstration, we showcase AQ-K-slack, an adaptive, buffer-based disorder handling approach, which supports executing sliding window aggregate queries over out-of-order data streams in a quality-driven manner. By adapting techniques from the fields of sampling-based approximate query processing and control theory, AQ-K-slack dynamically adjusts the input buffer size at query runtime to minimize the result latency, while respecting a user-specified threshold on relative errors in produced query results. We demonstrate a prototype stream processing system, which extends SAP Event Stream Processor with the implementation of AQ-K-slack. Through an interactive interface, the audience will learn the effect of different factors, such as the aggregate function, the window specification, the result error threshold, and stream properties, on the latency and the accuracy of query results. Moreover, they can experience the effectiveness of AQ-K-slack in obtaining user-desired latency vs. result accuracy trade-offs, compared to naive disorder handling approaches that make extreme trade-offs. For instance, by sacrificing 1% result accuracy, our system can reduce the result latency by 80% when compared to the state of the art.

The CVS algorithm for view synchronization in evolvable large-scale information systems

Nica, Lee, Rundensteiner 1998

Abstract. Current view technology supports only static views in the sense that views become undefined and hence obsolete as soon as the underlying information sources (ISs) undergo capability changes. We propose to address this new view evolution problem, which we call view synchronization, by a novel solution approach that allows affected view definitions to be dynamically evolved to keep them in synch with evolving ISs. We present in this paper a general strategy for the view synchronization process that, guided by constraints imposed by the view evolution preferences embedded in the view definition, achieves view preservation (i.e., view redefinition). We present the formal correctness, the CVS algorithm, as well as numerous examples to demonstrate the main concepts.

Exploiting ordered dictionaries to efficiently construct histograms with q-error guarantees in SAP HANA

Moerkotte, DeHaan, May et al. 2014

Histograms that guarantee a maximum multiplicative error (q-error) for estimates may significantly improve the plan quality of query optimizers. However, the construction time for histograms with maximum q-error was too high for practical use cases. In this paper we extend this concept with a threshold, i.e., an estimate or true cardinality θ, below which we do not care about the q-error because we still expect optimal plans. This allows us to develop far more efficient construction algorithms for histograms with bounded error. The test for θ,q-acceptability that we develop also exploits the order-preserving dictionary encoding of SAP HANA. We have integrated this family of histograms into SAP HANA, and we report on the construction time, histogram size, and estimation errors on real-world data sets. In virtually all cases the histograms can be constructed in far less than one second, requiring less than 5% of the space of the original compressed data.
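As a small illustration of the error measure, the sketch below computes the q-error of a single estimate and checks a thresholded acceptability condition. The function names are hypothetical, and the exact acceptability rule is an assumption inferred from the abstract: an estimate is accepted when both the estimate and the true cardinality are at most θ, or when the q-error is at most q. The histogram construction algorithms and the dictionary-based test from the paper are not reproduced here.

```python
import math


def q_error(estimate, truth):
    """Multiplicative error: max(estimate / truth, truth / estimate).

    Non-positive inputs are treated as an unbounded error (an assumption
    made here to keep the function total).
    """
    if estimate <= 0 or truth <= 0:
        return math.inf
    return max(estimate / truth, truth / estimate)


def is_theta_q_acceptable(estimate, truth, theta, q):
    """Assumed rule: ignore the q-error when both values are at most theta."""
    if estimate <= theta and truth <= theta:
        return True
    return q_error(estimate, truth) <= q
```

For example, with θ = 50 and q = 2, an estimate of 3 for a true cardinality of 40 is accepted even though its q-error exceeds 13, because both values lie below the threshold, whereas an estimate of 10 for a true cardinality of 100 is rejected.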

Anisoara Nica

The EVE approach: view synchronization in dynamic distributed environments

Quality-driven processing of sliding window aggregates over out-of-order data streams

Quality-Driven Continuous Query Execution over Out-of-Order Data Streams

The CVS algorithm for view synchronization in evolvable large-scale information systems

Exploiting ordered dictionaries to efficiently construct histograms with q-error guarantees in SAP HANA
