2015 IEEE International Conference on Big Data (Big Data) 2015
DOI: 10.1109/bigdata.2015.7363749

Workload scheduling in distributed stream processors using graph partitioning

Abstract: With ever-increasing data volumes, large compute clusters that process data in a distributed manner have become prevalent in industry. For distributed stream processing platforms (such as Storm), the question of how to distribute workload to available machines has important implications for the overall performance of the system. We present a workload scheduling strategy that is based on a graph partitioning algorithm. The scheduler is application agnostic: it collects the communication behavior of running appl…

Cited by 29 publications (25 citation statements)
References: 20 publications
“…On-demand computing resource allocation is the main target accomplished by adjusting the task schedule of the edge gateway via a lightweight virtualization technology (i.e., Docker). The authors of [21] trying to respond to question of how to distribute workload to available machines propose a workload scheduling strategy that is based on a graph partitioning algorithm. The proposed scheduler is application agnostic and builds on the data related to the communication behavior of running applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…Acking is done for all events, and the checkpoint interval is periodic (30 secs, by default) and has to be configured to balance operational costs and rollback loss for a dataflow. Hence, they also pose additional overheads if the fault-tolerance is a concern only during active migration and not during regular operations [8], [9]. This can be punitive during normal operations if the input rates are high [10].…”
Section: Background and Motivation (mentioning)
confidence: 99%
“…Fisher et al [11] solve the scheduling problem using graph partitioning. POIs are vertices of the graph, and are weighted by the computational resources they consume.…”
Section: Operator Instance Scheduling (mentioning)
confidence: 99%
“…Any online scheduler that actively measures communication between POIs can then notice the improvement and re-visit the POI placement decision, leading to even better performance. Our approach is similar to [11] as it relies on Metis for graph partitioning. Instead of considering a graph of POIs communicating, we consider a graph of keys that cooccur in the data.…”
Section: Operator Instance Scheduling (mentioning)
confidence: 99%
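
The citation statements above describe scheduling as graph partitioning: vertices (the POIs) are weighted by the computational resources they consume, and the partitioning itself is delegated to Metis. The Python sketch below only illustrates that formulation and is not the paper's scheduler. It swaps in a simple greedy heuristic instead of Metis, and it assumes that edges are weighted by the traffic measured between two vertices. The names `greedy_partition`, `cut_weight`, and `slack`, and the sample loads and traffic figures, are hypothetical.

```python
from collections import defaultdict


def greedy_partition(vertex_load, edge_traffic, num_machines, slack=1.1):
    """Place vertices on machines with a greedy heuristic (a stand-in, not Metis).

    vertex_load:  {vertex: resource weight}
    edge_traffic: {(u, v): communication weight}  (assumed edge weighting)
    slack:        allowed load imbalance per machine (hypothetical parameter)
    """
    capacity = slack * sum(vertex_load.values()) / num_machines

    # Build an undirected adjacency map from the edge list.
    neighbors = defaultdict(dict)
    for (u, v), weight in edge_traffic.items():
        neighbors[u][v] = weight
        neighbors[v][u] = weight

    machine_load = [0.0] * num_machines
    placement = {}

    # Heaviest vertices first; ties are broken by name for determinism.
    for vertex in sorted(vertex_load, key=lambda x: (-vertex_load[x], x)):
        # Traffic between this vertex and neighbors already on each machine.
        gain = [0.0] * num_machines
        for other, weight in neighbors[vertex].items():
            if other in placement:
                gain[placement[other]] += weight

        # Only machines with spare capacity qualify; fall back to all if none do.
        candidates = [
            m for m in range(num_machines)
            if machine_load[m] + vertex_load[vertex] <= capacity
        ] or list(range(num_machines))

        # Prefer the most co-located traffic, then the least-loaded machine.
        best = max(candidates, key=lambda m: (gain[m], -machine_load[m], -m))
        placement[vertex] = best
        machine_load[best] += vertex_load[vertex]

    return placement


def cut_weight(placement, edge_traffic):
    """Total traffic that crosses machine boundaries under a placement."""
    return sum(
        weight for (u, v), weight in edge_traffic.items()
        if placement[u] != placement[v]
    )


# Hypothetical example: two chatty pairs joined by one light edge.
loads = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
traffic = {("a", "b"): 10.0, ("c", "d"): 10.0, ("b", "c"): 1.0}
placement = greedy_partition(loads, traffic, num_machines=2)
crossing = cut_weight(placement, traffic)
```

In this example the heuristic keeps each heavily communicating pair on one machine, so only the light edge between `b` and `c` crosses machines. A real partitioner such as Metis optimizes the same kind of objective (small cut, balanced vertex weight) with far better algorithms.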