To improve their performance, scientific applications often employ loop scheduling algorithms for load balancing data-parallel computations. Over the years, a number of dynamic loop scheduling (DLS) techniques have been developed. These techniques are based on probabilistic analyses and are effective in addressing unpredictable load imbalances arising from various sources, such as variations in application, algorithmic, and systemic characteristics. Modern high-end computing facilities now offer petascale performance (10^15 FLOPS), and several initiatives aim to achieve exascale performance (10^18 FLOPS) towards the end of the current decade. Efficient and scalable algorithms are therefore required to fully utilize petascale and exascale resources. In this paper, a study of the scalability of DLS techniques via discrete event simulation is presented, in terms of both the number of processors and the problem size. To facilitate this study, a dynamic loop scheduler was designed and implemented using the SimGrid [1] simulation framework. The results demonstrate the scalability of the DLS techniques and their effectiveness in addressing load imbalance in large-scale computing systems.
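To make the scheduling idea above concrete, the following is a minimal sketch of one well-known DLS-family technique, guided self-scheduling (GSS), in which each requesting processor receives a chunk equal to the remaining iterations divided by the number of processors. This is an illustrative example of the class of techniques the abstract refers to, not the scheduler implemented in the paper.

```python
import math

def guided_self_scheduling(total_iters, num_procs):
    """Yield successive chunk sizes under guided self-scheduling (GSS):
    each chunk is ceil(R / p), where R is the number of remaining
    iterations and p the number of processors. Chunks shrink over time,
    so early load imbalance can be smoothed out by the smaller late chunks."""
    remaining = total_iters
    while remaining > 0:
        chunk = math.ceil(remaining / num_procs)
        yield chunk
        remaining -= chunk

# Example: 100 loop iterations scheduled across 4 processors.
chunks = list(guided_self_scheduling(100, 4))
```

The decreasing chunk sequence (25, 19, 14, ... for this example) is what gives such self-scheduling schemes their dynamic load-balancing behavior: large chunks keep scheduling overhead low early on, while small final chunks even out the finish times.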
The execution of computationally intensive parallel applications in heterogeneous environments, where the quality and quantity of computing resources available to a single user change continuously, often leads to irregular behavior, generally due to variations of an algorithmic and systemic nature. To improve the performance of scientific applications, loop scheduling algorithms are often employed for load balancing of their parallel loops. However, it is a challenge to select the most robust scheduling algorithm for guaranteeing optimized performance of scientific applications on large-scale computing systems, whose resources are widely distributed, highly heterogeneous, often shared among multiple users, and whose computing availabilities cannot always be guaranteed or predicted. To address this challenge, in this work we focus on a portfolio-based approach that enables the dynamic selection and use of the most robust dynamic loop scheduling (DLS) algorithm from a portfolio of DLS algorithms, depending on the given application and the current system characteristics, including workload conditions. Thus, in this paper we provide a solution to the algorithm selection problem and experimentally evaluate its quality. We propose the use of supervised machine learning techniques to build empirical robustness prediction models, which are used to predict a DLS algorithm's robustness for given scientific application characteristics and system availabilities. Using simulated application characteristics and system availabilities, along with the empirical robustness prediction models, we show that the proposed portfolio-based approach enables the selection of the most robust DLS algorithm that satisfies a user-specified tolerance on the application's performance in a particular computing system with variable availability.
We also show that the portfolio-based approach offers stronger guarantees of robust application performance with the automatically selected DLS algorithms than with a manually selected DLS algorithm.
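The selection step described in this abstract can be sketched as follows. This is a hypothetical illustration, assuming a trained robustness prediction model is available behind a `predict(alg, features)` callable and that lower predicted scores mean more robust; the algorithm names (STATIC, FSC, GSS, FAC) are standard loop scheduling techniques used here only as placeholders for the actual portfolio.

```python
def select_most_robust(portfolio, features, predict, tolerance):
    """From a portfolio of DLS algorithms, return the one whose predicted
    robustness score satisfies the user-specified tolerance and is best
    (lowest). `predict(alg, features)` stands in for an empirical
    robustness prediction model trained via supervised learning."""
    best_alg, best_score = None, float("inf")
    for alg in portfolio:
        score = predict(alg, features)
        if score <= tolerance and score < best_score:
            best_alg, best_score = alg, score
    return best_alg, best_score

# Toy stand-in for a trained model: fixed predicted scores per algorithm.
predicted = {"STATIC": 1.8, "FSC": 1.3, "GSS": 1.2, "FAC": 1.15}

def toy_predict(alg, features):
    return predicted[alg]

choice, score = select_most_robust(
    list(predicted), features={}, predict=toy_predict, tolerance=1.5
)
```

In this toy run, STATIC is filtered out by the tolerance and FAC is chosen as the most robust remaining candidate; in the paper's setting the scores would instead come from models trained on simulated application and system characteristics.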
Scientific applications running on heterogeneous computing systems, which often exhibit unpredictable behavior, enhance their performance by employing loop scheduling techniques to avoid load imbalance through an optimized assignment of their parallel loops. With current computing platforms delivering petascale performance and promising exascale performance towards the end of the present decade, efficient and robust algorithms are required to guarantee optimal performance of parallel applications in the presence of unpredictable perturbations. A number of dynamic loop scheduling (DLS) methods based on probabilistic analyses have been developed to achieve the desired robust performance. In earlier work, two metrics, flexibility and resilience, were formulated to quantify the robustness of various DLS methods in heterogeneous computing systems with uncertainties. In this work, to ensure robust performance of scientific applications on current (petascale) and future (exascale) high performance computing systems, a simulation model was designed and integrated into the SimGrid simulation toolkit, enabling a comprehensive study of the robustness of the DLS methods based on experimental cases with various combinations of numbers of processors, problem sizes, and scheduling methods. The DLS methods were implemented in the simulation model and analyzed to explore their flexibility (robustness against unpredictable variations in the system load) across a range of scenarios comprising various distributions of loop iteration execution times and system availability. The reported simulation results are used to compare the robustness of the DLS methods under the environments considered, using the flexibility metric.
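As a rough illustration of how a flexibility-style check might be applied to such simulation results, the sketch below deems a DLS method flexible for a scenario when its makespan under a load perturbation stays within a tolerance factor of its expected (unperturbed) makespan. The exact formulation of the flexibility metric is given in the cited earlier work; the tolerance-factor form used here is an assumption for illustration only.

```python
def is_flexible(perturbed_makespan, expected_makespan, tau):
    """Assumed flexibility check: a DLS method tolerates a system-load
    perturbation if the perturbed makespan does not exceed tau times the
    expected makespan (tau > 1 is a user-chosen tolerance factor)."""
    return perturbed_makespan <= tau * expected_makespan

def flexibility_ratio(perturbed_makespans, expected_makespan, tau):
    """Fraction of simulated perturbation scenarios in which the method
    stays within tolerance; higher means more flexible."""
    hits = sum(
        1 for m in perturbed_makespans
        if is_flexible(m, expected_makespan, tau)
    )
    return hits / len(perturbed_makespans)

# Toy data: makespans from four simulated perturbation scenarios.
ratio = flexibility_ratio([10.5, 11.0, 13.0, 11.8], expected_makespan=10.0, tau=1.2)
```

Comparing such ratios across DLS methods, processor counts, and problem sizes mirrors the kind of robustness comparison the abstract describes, albeit with this simplified stand-in metric.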