Abstract: Computer grids have attracted great attention from both the academic and enterprise communities, becoming an attractive alternative for the execution of applications that demand huge computational power and allowing the integration of computational resources spread across different administrative domains. The dynamic nature of the grid infrastructure, its high scalability, and its great heterogeneity increase the likelihood of errors, making fault tolerance a major requirement for grid middleware. This paper describes a flexible fault-tolerance mechanism implemented in the Integrade grid middleware that allows the customization of several failure handling parameters and the combination of different failure handling techniques. This paper also presents several experiments that measure the benefits of our approach across several different execution environment scenarios.

I. INTRODUCTION

A computer grid comprises a hardware and software infrastructure that allows the integration and sharing of distributed resources, such as software, data, and peripherals, within and among institutions. This computational infrastructure has attracted great attention from the academic and enterprise communities, becoming an attractive alternative for the execution of applications that demand huge computational power and allowing the integration of computational resources spread across different administrative domains.

Computational grids have been used to solve problems in varied areas of scientific, enterprise, and industrial activity, such as computational biology, image processing for medical diagnosis, weather forecasting, high energy physics, marketing simulations, and oil prospecting. Grid computing has empowered the conception of a new generation of applications that combine computations, experiments, observations, and data obtained in real time.
The phenomena modeled by these applications require diverse software components whose compositions and interactions are extremely dynamic. Moreover, the grid infrastructure is also heterogeneous and dynamic, aggregating a large number of computation and communication resources, databases and, sometimes, sensors and specific peripherals. This dynamism shows up as high variation in resource availability, node instability, and workload variations in nodes and network links.

The dynamic nature of the grid infrastructure, its high scalability, and its great heterogeneity have made impracticable its