2017 IEEE 7th International Advance Computing Conference (IACC) 2017
DOI: 10.1109/iacc.2017.0043

Fault-Tolerant Scheduling for Scientific Workflows in Cloud Environments

Abstract: Executing clustered tasks has proven to be an efficient method to improve the computation of Scientific Workflows (SWf) on clouds. However, clustered tasks have a higher probability of suffering from failures than a single task. Therefore, fault tolerance in cloud computing is essential while running large-scale scientific applications. In this paper, a new heuristic called Cluster based Heterogeneous Earliest Finish Time (C-HEFT) algorithm to enhance the scheduling and fault tolerance mechan…

Cited by 15 publications (7 citation statements)
References 13 publications
“…Unfortunately, this process requires an excess of resources and thus the overall cost is significantly increased. Resubmission [17] refers to the process of executing a failed task from the start. This process may be repeated on the same computational node or another one that has been chosen.…”
Section: Related Work (mentioning)
confidence: 99%
“…Though the proposed elastic resource provisioning mechanism based on primary-backup model improves the resource utilization in the context of fault tolerant, the faults that occur at the physical server level cannot be handled by the proposed scheduling algorithms. Authors in [30] have proposed similar fault tolerant scientific work-flow scheduling algorithm considering the spot and on-demand instances on the cloud.…”
Section: Related Work (mentioning)
confidence: 99%
“…Resubmission and replication are two fundamental FT mechanisms that have been effectively applied in the cluster [13,14], grid [15][16][17], and cloud [6,18,19]. Resubmission resubmits a task after a failure happens, which can effectively reduce resource consumption.…”
Section: Introduction (mentioning)
confidence: 99%
“…Resubmission resubmits a task after a failure happens, which can effectively reduce resource consumption. However, it introduces a longer task execution time, resulting in hardly satisfying the deadline constraint [19]. Alternatively, replication duplicates multiple task copies and allocates them to different computational units [18].…”
Section: Introduction (mentioning)
confidence: 99%
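
The citation statements above contrast two fault-tolerance mechanisms: resubmission, which re-executes a failed task from the start on the same or another node, and replication, which runs several copies of a task on different computational units. The following is a minimal Python sketch of that contrast, not the C-HEFT algorithm or any cited paper's implementation. The names `flaky_task`, `run_with_resubmission`, and `run_with_replication` are hypothetical, node failure is modelled as a `RuntimeError`, and the replicas are executed one after another purely for illustration (a real system would run them in parallel).

```python
from typing import Callable, Sequence, Tuple


def flaky_task(node: str) -> str:
    """Hypothetical task: fails on node 'n1', succeeds on any other node."""
    if node == "n1":
        raise RuntimeError("node failure")
    return f"done on {node}"


def run_with_resubmission(
    task: Callable[[str], str], nodes: Sequence[str], max_attempts: int
) -> Tuple[str, int]:
    """Re-execute the task from the start after each failure.

    Returns the result and the number of executions consumed.
    Executions happen one at a time, so every failure adds to the
    total completion time.
    """
    for attempt in range(max_attempts):
        node = nodes[attempt % len(nodes)]  # same or another node
        try:
            return task(node), attempt + 1
        except RuntimeError:
            continue  # failure detected: resubmit
    raise RuntimeError(f"task failed after {max_attempts} attempts")


def run_with_replication(
    task: Callable[[str], str], nodes: Sequence[str]
) -> Tuple[str, int]:
    """Launch one copy of the task per node and keep the first success.

    Returns the result and the number of copies launched.
    Every copy consumes resources, whether or not it was needed.
    """
    results = []
    for node in nodes:  # simulated sequentially; conceptually parallel
        try:
            results.append(task(node))
        except RuntimeError:
            pass  # this replica was lost; others may still succeed
    if not results:
        raise RuntimeError("all replicas failed")
    return results[0], len(nodes)


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3"]
    print(run_with_resubmission(flaky_task, nodes, max_attempts=3))
    print(run_with_replication(flaky_task, nodes))
```

In this toy run, resubmission uses two executions in sequence (one failed, one successful), while replication launches three copies up front. That mirrors the trade-off described in the statements: resubmission saves resources but lengthens execution time, and replication does the opposite.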