The 6th IEEE/ACM International Workshop on Grid Computing, 2005
DOI: 10.1109/grid.2005.1542744

Reliability-aware resource management for computational grid/cluster environments

Abstract: The collective resource utilization achieved through grid computing is critical to the overall computing capacity of the collaborative community and should be guaranteed. In particular, in an existing environment where job sites are Beowulf cluster systems, a service node failure may cause an outage of the whole system. Current grid fault tolerance techniques only address these issues in an opportunistic fashion. Thus, there is a need to complement these approaches by proactively handling failures at a job-site le…

Citations: cited by 13 publications (5 citation statements)
References: 7 publications
“…SOAP request with small number of parameters or data), and not tasks that would involve time-intensive replication (such as duplication of large data sets) and whose handling would influence the overall performance of the proposed scheme. Other research works that are based on the same approach of a negligible replication overhead are presented in [27][28][29].…”
Section: Fault-tolerance Replication Model (mentioning)
confidence: 97%
“…SOAP request with small number of parameters or data), and not tasks that would involve time-intensive replication (such as duplication of large data sets) and whose handling would influence the overall performance of the proposed scheme. Other research works that are based on the same approach of a negligible replication overhead are presented in [27][28][29]. Furthermore, by producing task replicas, the failure probability of each task can be significantly lowered; however, the number of tasks that are finally assigned to the mobile Grid increases, increasing, respectively, the total workload that is assigned to the Grid for execution.…”
Section: Fault-tolerance Replication Model (mentioning)
confidence: 99%
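
The trade-off described in the citation statement above (replicas lower a task's failure probability but raise the total workload) can be illustrated with a minimal sketch. It assumes that replica failures are independent and share one failure probability, which the quoted text does not state; the function names and the numbers used are hypothetical and are not taken from the cited paper.

```python
def replicated_failure_probability(p_fail: float, replicas: int) -> float:
    """Probability that every replica of a task fails.

    Assumption (not from the source): replicas fail independently,
    each with the same probability p_fail.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be in [0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return p_fail ** replicas


def total_workload(num_tasks: int, replicas: int) -> int:
    """Number of task instances submitted when each task is replicated."""
    if num_tasks < 0 or replicas < 1:
        raise ValueError("num_tasks must be >= 0 and replicas >= 1")
    return num_tasks * replicas


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    p_fail, num_tasks = 0.1, 100
    for k in (1, 2, 3):
        p_all_fail = replicated_failure_probability(p_fail, k)
        load = total_workload(num_tasks, k)
        print(f"replicas={k}  task failure probability={p_all_fail:.4f}  instances={load}")
```

Under these assumptions the failure probability falls geometrically with the number of replicas while the submitted workload grows only linearly, which is why replication is attractive when the per-replica overhead is negligible.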
“…Single points of failure can be avoided through implementation of redundancy, such as the backup generator in Fig. 1 [26]. • State control: Each smart grid operator object (e.g., voltage stabilizer in Fig.…”
Section: Smart Grid Object Requirements (mentioning)
confidence: 99%
“…Clusters [9][10][11] are deployed to improve reliability and availability in safety-critical systems, such as Google Linux Cluster. A cluster system consists of a group of independent computers running an independent operating system and working together as a single system to provide a powerful computing environment and high availability of services, particularly for computation intensive tasks.…”
Section: Introduction (mentioning)
confidence: 99%