Data streaming applications are an important class of data-intensive systems. Performance is an essential quality of such systems. It is expressed, for example, by the delay of analysis results or the utilization of system resources. Architecture-level decisions, such as the configuration of sources, sinks, and operations, their deployment, or the choice of technology, affect performance. Current component-based performance prediction approaches cannot accurately predict the performance of these systems, because they do not support the metrics specific to data streaming applications and only approximate the behavior of data stream operations instead of expressing it explicitly. In particular, operations that group multiple data events, and thus introduce timing dependencies between different calls to the system, are not represented sufficiently. In this paper, we present an approach for modeling networks of data stream operations, including their parameters, with the goal of predicting the performance of the resulting composed data streaming application. The approach is based on a component-based performance model with queueing semantics for processing resources. Our evaluation shows that our model expresses the behavior of the system more accurately, and is thus more expressive, than a well-encapsulated component-based model without data stream operations.