When a workflow application is executed in a Service-Oriented Grid (SOG), performance issues such as service scheduling must be considered to achieve high and stable execution performance. However, most prior work on workflow management neither studies these performance issues nor provides methodologies for evaluating the performance of Grid Services, so it cannot readily be applied to the service scheduling problem in SOG. In this paper, we propose and model evaluation metrics for Grid Service performance. The metrics are derived from common properties of Grid Services and are used to quantify and evaluate the performance of an individual Grid Service. With these metrics, we develop a service scheduling scheme based on a list scheduling heuristic that chooses the most suitable Grid Services for the tasks in a workflow application, which ensures high performance in the execution of the workflow application. In addition, we propose a low-overhead rescheduling method, referred to as Adaptive List Scheduling for Service (ALSS), to adapt to the dynamic nature of a grid environment. ALSS provides stable performance for workflow applications, even in abnormal circumstances. Finally, we design an experimental environment with actual traces and perform simulations to quantify the benefits of our approach. The experiments demonstrate that ALSS outperforms conventional scheduling methods. Our scheme produces a scheduling performance that is superior to AHEFT
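As a rough illustration of the list scheduling idea mentioned in this abstract, the sketch below orders workflow tasks by an upward rank and assigns each one to the service that gives the earliest finish time. It is not the ALSS method itself: the abstract does not define its metrics, so the per-service cost table, the names `list_schedule`, `succ`, and `cost`, and the omission of communication costs are all assumptions made for this example. Costs are assumed to be positive so that predecessors always rank ahead of their successors.

```python
from functools import lru_cache


def list_schedule(succ, cost):
    """Generic list scheduling sketch (hypothetical, not the paper's ALSS).

    succ: task -> list of successor tasks (must form a DAG).
    cost: task -> {service: estimated execution time}, an assumed stand-in
          for the performance metrics the abstract refers to.
    Returns task -> (service, start, finish).
    """
    tasks = list(cost)
    pred = {t: [] for t in tasks}
    for t, children in succ.items():
        for c in children:
            pred[c].append(t)

    @lru_cache(maxsize=None)
    def rank(t):
        # Upward rank: average cost of t plus the longest ranked path below it.
        avg = sum(cost[t].values()) / len(cost[t])
        return avg + max((rank(c) for c in succ.get(t, [])), default=0.0)

    # Higher rank first; with positive costs this is a valid topological order.
    order = sorted(tasks, key=rank, reverse=True)

    free_at = {}  # service -> time at which it becomes idle
    plan = {}
    for t in order:
        # A task can start only after all of its predecessors have finished.
        ready = max((plan[p][2] for p in pred[t]), default=0.0)
        best = None
        for service, c in cost[t].items():
            start = max(ready, free_at.get(service, 0.0))
            finish = start + c
            if best is None or finish < best[2]:
                best = (service, start, finish)
        plan[t] = best
        free_at[best[0]] = best[2]
    return plan


# Illustrative diamond-shaped workflow with two candidate services.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cost = {
    "A": {"s1": 2, "s2": 3},
    "B": {"s1": 4, "s2": 2},
    "C": {"s1": 3, "s2": 3},
    "D": {"s1": 1, "s2": 2},
}
plan = list_schedule(succ, cost)
makespan = max(finish for _, _, finish in plan.values())
```

A rescheduling scheme in the spirit of the abstract would rerun a procedure like this on the unfinished tasks when the observed service performance changes; how ALSS decides when to do so is not stated here.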
In this paper, we address the issues of resource management and fault tolerance in Grids. In Grids, the state of the resources selected for job execution is a primary factor that determines computing performance. Specifically, we propose a resource manager for optimal resource selection. The resource manager automatically selects the optimal resources among candidate resources using a genetic algorithm. Typically, the probability of failure is higher in grid computing than in traditional parallel computing, and the failure of resources affects job execution fatally. Therefore, a fault tolerance service is essential in computational grids, and grid services are often expected to meet some minimum levels of Quality of Service (QoS) for desirable operation. To address this issue, we also propose a fault tolerance service to satisfy QoS requirements. We extend the definition of failures, such as process failure, processor failure, and network failure, and design the fault detector and fault manager. The simulation results indicate that our approaches are promising in that (1) our resource manager finds the optimal set of resources that guarantees the optimal performance, (2) the fault detector detects the occurrence of resource failures, and (3) the fault manager guarantees that the submitted jobs complete and improves the performance of job execution through job migration even if some failures happen.
This work was done as a part of the Information & Communication Fundamental Technology Research Program supported by the Ministry of Information & Communication in the Republic of Korea. 0-7803-8430-X/04/$20.00 ©2004 IEEE
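The genetic-algorithm resource selection described in this abstract could look roughly like the following sketch, which picks `k` resources out of a candidate list. The abstract does not give the chromosome encoding, fitness function, or operators, so everything here is an assumption for illustration: the function name `select_resources`, the per-resource `scores`, the additive fitness, the union-based crossover, and the mutation rate are all hypothetical.

```python
import random


def select_resources(scores, k, generations=50, pop_size=20, seed=0):
    """Pick k of len(scores) candidate resources with a simple GA.

    scores: hypothetical per-resource quality values (higher is better).
    Requires k <= len(scores). Returns a sorted list of k distinct indices.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n = len(scores)

    def fitness(chromosome):
        # Assumed fitness: total score of the selected resources.
        return sum(scores[i] for i in chromosome)

    def crossover(a, b):
        # Child draws k distinct resources from the union of both parents.
        pool = list(set(a) | set(b))
        return rng.sample(pool, k)

    def mutate(chromosome):
        # Swap one selected resource for one that is currently unselected.
        chromosome = list(chromosome)
        outside = [i for i in range(n) if i not in chromosome]
        if outside:
            chromosome[rng.randrange(k)] = rng.choice(outside)
        return chromosome

    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # elitism: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = crossover(a, b)
            if rng.random() < 0.2:  # assumed mutation probability
                child = mutate(child)
            children.append(child)
        population = elite + children
    return sorted(max(population, key=fitness))


scores = [5, 1, 9, 3, 7, 2]
chosen = select_resources(scores, k=3)
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases between generations, although a run like this is not guaranteed to reach the global optimum.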