2010
DOI: 10.1007/s11227-009-0345-y

A hybrid fault tolerance technique in grid computing system

Abstract: To achieve a high level of reliability and availability, the grid infrastructure should be foolproof and fault tolerant. Fault tolerance plays a key role in ensuring the availability and reliability of a grid system. Since the failure of resources can be fatal to job execution, a fault tolerance service is essential to satisfy QoS requirements in grid computing. In this paper we propose two hybrid fault tolerance techniques (FTTs), called alternate task with checkpoint and alternate task with retr…

Cited by 23 publications (15 citation statements)
References 15 publications
“…A classification of techniques has been proposed by [106] which distinguishes between two classes of failure handling techniques, namely, task level failure handling and workflow level failure handling. The recovery techniques that can be performed at the task level for masking the fault effects are called task level techniques.…”
Section: Task Resubmission; citation type: mentioning
confidence: 99%
“…In other words, workflow level FTTs change the flow of execution on failure according to the knowledge of task execution context. They can also be classified into four different types: alternate task, redundancy, user defined exception handling and rescue workflow [106]. The only difference between alternate task and retry technique is that alternate task exchanges a task with a different implementation of the same task with different execution characteristics on the failure of the first one.…”
Section: Task Resubmission; citation type: mentioning
confidence: 99%
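The distinction drawn in the statement above between retry and alternate task can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the function names `run_with_retry` and `run_with_alternates` and the `max_attempts` parameter are hypothetical, and a task is assumed to be any zero-argument callable that raises an exception on failure.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def run_with_retry(task: Callable[[], T], max_attempts: int = 3) -> T:
    """Retry (hypothetical helper): re-run the same implementation on failure."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return task()
        except Exception as exc:  # any failure triggers another attempt
            last_error = exc
    raise RuntimeError("all retries failed") from last_error


def run_with_alternates(implementations: Sequence[Callable[[], T]]) -> T:
    """Alternate task (hypothetical helper): on failure, switch to a
    different implementation of the same task."""
    last_error: Optional[Exception] = None
    for impl in implementations:
        try:
            return impl()
        except Exception as exc:  # fall through to the next implementation
            last_error = exc
    raise RuntimeError("all alternate implementations failed") from last_error
```

In this sketch the two helpers differ only in what is executed after a failure: the same callable again, or the next callable in a list of interchangeable implementations.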
“…To attain high levels of availability and reliability, the infrastructure of grid must be fault tolerant (Qureshi et al 2011). Avizienis et al (2004) presented a dependability taxonomy that has been extended by incorporating more factors extracted from the literature.…”
Section: Challenges In Grid Dependability; citation type: mentioning
confidence: 99%
“…Similarly, the design goals have also been identified that can lead us to more reliable, available, and secure grid environments. Previously identified and published research (Nazir et al 2012; Haider and Ansari 2012; Haider et al 2011; Qureshi et al 2011; Malik et al 2012; Nazir et al 2009; Khan et al 2010) regarding fault tolerance pertaining to different types of errors, failures, and faults and the corresponding subtypes are also part of this survey, which discloses a very wide range of problems expected in the grid computing environments.…”
Section: Challenges In Grid Dependability; citation type: mentioning
confidence: 99%
“…It has been widely used in solving challenging problems in the real world, such as protein folding [1,2], hydrology modelling [3], and natural disasters simulation [4]. The main reason for deploying grid computing is to introduce a system that is scalable, simple to use, autonomic, and able to deal with faults [5]. Grid computing emerged from meta-computing back in 1990s to support diverse online processing and data intensive application [6,7,8].…”
Section: Introduction; citation type: mentioning
confidence: 99%