Reliability is critical to many applications built on distributed event processing systems. The use of stateful operators in particular makes it challenging to recover efficiently from failures and to ensure consistent event streams. Even during failure-free execution, state-of-the-art methods for achieving reliability incur significant run-time overhead in computational resources, event traffic, and event detection time. This paper proposes a novel method for rollback-recovery that allows for recovery from multiple simultaneous operator failures but eliminates the need for persistent checkpoints. To this end, the operator state is preserved in savepoints, taken at points in time when the operator's execution depends solely on the state of incoming event streams, which are reproducible by predecessor operators. We propose an expressive event processing model to determine savepoints, along with algorithms for their coordination in a distributed operator network. Evaluations show that the method achieves very low overhead during failure-free execution in comparison to other approaches.
Real-time diagnostic simulations are a challenging application domain that is expected to place high demands on global sensor applications. Besides imposing hard bounds on the latency at which data needs to be processed, such simulation applications will impose high requirements on available bandwidth. Predictors, originally introduced in the domain of wireless sensor networks for energy saving, are an appealing way to provide real-time estimates while significantly reducing data rates. While many prediction models have been analyzed in the setting of wireless sensor networks, their behavior and use are unclear when applied to distributed data streams, where aggregation results are typically processed over multilevel hierarchies. In the context of weather simulations, we propose a distributed R-Tree-based aggregation algorithm that allows for efficient reuse of aggregate queries. Using real temperature readings taken from weather stations during one month, we study the trade-off between updates of the prediction model and the precision of the predicted values. Our evaluations indicate that even in situations where complex prediction models are expected to perform best, simple prediction models give higher benefits in saved bandwidth while providing similar data accuracy.
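The trade-off between model updates and precision can be illustrated with a minimal sketch. It assumes the simplest possible predictor, one that predicts the last transmitted value, and a hypothetical error tolerance; the function names `suppress_updates` and `reconstruct` and the sample readings are invented for illustration and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def suppress_updates(readings: List[float], tolerance: float) -> List[Tuple[int, float]]:
    """Return the (index, value) updates a source would actually transmit.

    Assumed scheme: source and receiver both predict "the last transmitted
    value"; the source sends a new value only when the prediction is off by
    more than `tolerance`.
    """
    updates: List[Tuple[int, float]] = []
    predicted: Optional[float] = None
    for i, value in enumerate(readings):
        if predicted is None or abs(value - predicted) > tolerance:
            updates.append((i, value))
            predicted = value  # both sides now predict this value
    return updates


def reconstruct(updates: List[Tuple[int, float]], length: int) -> List[Optional[float]]:
    """Rebuild the receiver's estimate for every time step from the updates."""
    by_index = dict(updates)
    estimates: List[Optional[float]] = []
    current: Optional[float] = None
    for i in range(length):
        if i in by_index:
            current = by_index[i]
        estimates.append(current)
    return estimates


# Hypothetical temperature readings in degrees Celsius.
readings = [20.0, 20.2, 20.4, 21.1, 21.0, 19.8]
updates = suppress_updates(readings, tolerance=0.5)
estimates = reconstruct(updates, len(readings))
print(f"sent {len(updates)} of {len(readings)} readings")
```

A larger tolerance sends fewer updates at the cost of a larger worst-case error, which is the kind of trade-off the abstract describes. A more complex model would replace the last-value prediction on both sides.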
Many application classes, such as monitoring applications, involve processing a massive amount of data from a possibly huge number of data sources. Complex Event Processing (CEP) has evolved as the paradigm of choice to determine meaningful situations (complex events) by performing stepwise correlation over event streams. To keep up with the high scalability demands of growing input streams, recent approaches distribute event correlation over several correlation nodes. However, the failure of even a single correlation node impacts the correctness of the final correlation result. In this paper, we illustrate the importance of strong reliability semantics for CEP in the context of a monitoring application in a distributed production environment. Strong reliability ensures that each complex event is detected and delivered exactly once to each application entity; it cannot be guaranteed by the naive application of established replication principles. We present a replication scheme that ensures strong reliability in an asynchronous system model and can be applied to an arbitrary distributed CEP system. The algorithm tolerates f simultaneous failures by introducing f additional replicas for each correlation node. We prove correctness and evaluate the overhead introduced by the algorithm. Results show that the overhead scales linearly with the number of deployed replicas and the node failure rate.
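The exactly-once goal with f additional replicas can be pictured with a small sketch. This is not the paper's replication scheme: it assumes, as a strong simplification, that all replicas of a correlation node emit identical complex events with identical identifiers, which is precisely the part that is hard to guarantee in an asynchronous system. The class `DedupConsumer` and the event data are hypothetical.

```python
from typing import Hashable, List, Set, Tuple

Event = Tuple[Hashable, str]  # (event_id, payload); the id scheme is assumed


class DedupConsumer:
    """Delivers each complex event once, dropping copies from other replicas."""

    def __init__(self) -> None:
        self._seen: Set[Hashable] = set()
        self.delivered: List[str] = []

    def receive(self, event: Event) -> bool:
        event_id, payload = event
        if event_id in self._seen:
            return False  # duplicate from another replica
        self._seen.add(event_id)
        self.delivered.append(payload)
        return True


f = 2  # number of simultaneous failures to tolerate
stream: List[Event] = [(1, "overheat"), (2, "jam")]

# f + 1 replicas of one correlation node, each producing the same output.
replica_outputs = [list(stream) for _ in range(f + 1)]
replica_outputs[0] = replica_outputs[0][:1]  # replica 0 fails after one event
replica_outputs[1] = []                      # replica 1 fails immediately

consumer = DedupConsumer()
for output in replica_outputs:
    for event in output:
        consumer.receive(event)

print(consumer.delivered)
```

With f of the f + 1 replicas failing, the surviving replica still supplies every event, and the identifier check keeps the events seen from several replicas from being delivered twice.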