This paper describes a methodology that provides detailed predictive performance information throughout the software design and implementation cycles. It is structured around a hierarchy of performance models that describe the computing system in terms of its software, parallelization, and hardware components. The methodology is illustrated with an implementation, the performance analysis and characterization environment (PACE) system, which provides information concerning execution time, scalability, and resource use. A principal aim of the work is to provide a capability for rapid calculation of relevant performance numbers without sacrificing accuracy. The predictive nature of the approach provides both pre and post implementation analyses and allows implementation alternatives to be explored prior to the commitment of an application to a system. Because of the relatively fast analysis times, these techniques can be used at runtime to assist in application steering and scheduling with reference to dynamically changing systems and metacomputing.
Resource management is an important component of a grid computing infrastructure. The scalability and adaptability of such systems are two key challenges that must be addressed. In this work an agent-based resource management system, ARMS, is implemented for grid computing. ARMS utilises the performance prediction techniques of the PACE toolkit to provide quantitative data regarding the performance of complex applications running on a local grid resource. At the meta-level, a hierarchy of homogeneous agents are used to provide a scalable and adaptable abstraction of the system architecture. Each agent is able to cooperate with other agents and thereby provide service advertisement and discovery for the scheduling of applications that need to utilise grid resources. A case study with corresponding experimental results is included to demonstrate the efficiency of the resource management and scheduling system.
Abstract. Workload and resource management are essential functionalities in the software infrastructure for grid computing. The management and scheduling of dynamic grid resources in a scalable way requires new technologies to implement a next generation intelligent grid environment. This work demonstrates that AI technologies can be utilised to achieve effective workload and resource management. A combination of intelligent agents and multi-agent approaches is applied for both local grid resource scheduling and global grid load balancing. Each agent is a representative of a local grid resource and utilises predictive application performance data and iterative heuristic algorithms to engineer local load balancing across multiple hosts. At a higher level of the global grid, agents cooperate with each other to balance workload using a peer-to-peer service advertisement and discovery mechanism. A case study is included with corresponding experimental results to demonstrate that intelligent agents are effective to achieve resource scheduling and load balancing, improve application execution performance and maximise resource utilisation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.