Online scheduling plays a key role for streaming applications in a big data stream computing environment, because the arrival rate of a high-velocity continuous data stream may fluctuate over time. This paper proposes E-Stream, an elastic online scheduling framework for big data streaming applications, with the following features. (1) It profiles the mathematical relationships between system response time, fairness among multiple applications, and the online features of high-velocity continuous streams. (2) It scales a data stream graph out or in by quantifying computation and communication cost and the vertex semantics with respect to the arrival rate of the data stream, and adjusts the degree of parallelism of the vertices in the graph accordingly. Sub-graphs are further constructed so as to minimize the data dependencies among them. (3) It elastically schedules a single graph with a priority-based earliest-finish-time-first online scheduling strategy, and schedules multiple graphs with a max-min fairness strategy. (4) It is evaluated against the objectives of low system response time and acceptable application fairness in a real-world big data stream computing environment. Experimental results demonstrate that E-Stream provides better system response time and application fairness than the existing Storm framework.
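As an illustration of the max-min fairness strategy named in feature (3), the following is a minimal sketch of the textbook progressive-filling procedure, not E-Stream's actual implementation. It assumes that each data stream graph requests some amount of a single divisible resource (for example, worker slots); the function name `max_min_fair_share`, the graph names, and the demand values are all hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` among competing graphs using max-min fairness.

    `demands` maps a (hypothetical) graph name to its requested amount.
    Small demands are met in full; the rest share what is left equally.
    """
    allocation = {name: 0.0 for name in demands}
    remaining = float(capacity)
    unsatisfied = {name for name, d in demands.items() if d > 0}

    while unsatisfied and remaining > 1e-12:
        # Equal share of the remaining capacity for every unsatisfied graph.
        share = remaining / len(unsatisfied)
        # Graphs whose residual demand fits within that equal share.
        satisfied = {n for n in unsatisfied
                     if demands[n] - allocation[n] <= share}
        if not satisfied:
            # No demand can be fully met: split what is left evenly and stop.
            for n in unsatisfied:
                allocation[n] += share
            remaining = 0.0
            break
        # Grant the small demands in full and redistribute the surplus.
        for n in satisfied:
            need = demands[n] - allocation[n]
            allocation[n] += need
            remaining -= need
        unsatisfied -= satisfied

    return allocation


if __name__ == "__main__":
    # Three hypothetical graphs competing for 10 slots.
    print(max_min_fair_share(10, {"graph_a": 2, "graph_b": 4, "graph_c": 8}))
```

In this sketch, a graph never receives more than it asks for, and any capacity left over by a small request is shared among the graphs that still need more.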