Scientific workflows are frequently modeled as Directed Acyclic Graphs (DAGs) of tasks, which represent computational modules and their dependencies in the form of data produced by one task and used by another. This formulation allows the use of runtime systems that dynamically allocate tasks onto the resources of increasingly complex computing platforms. However, for some workflows, such a dynamic schedule may run out of memory by exposing too much parallelism. This paper focuses on the problem of transforming such a DAG to prevent memory shortage, and concentrates on shared-memory platforms. We first propose a simple model of DAGs that is expressive enough to emulate complex memory behaviors. We then exhibit a polynomial-time algorithm that computes the maximum peak memory of a DAG, that is, the maximum memory needed by any parallel schedule. We consider the problem of reducing this maximum peak memory below a given bound by adding new fictitious edges, while trying to minimize the critical path of the graph. After proving this problem NP-complete, we provide an ILP solution as well as several heuristic strategies that are thoroughly compared by simulation on synthetic DAGs modeling actual computational workflows. We show that on most instances we are able to decrease the maximum peak memory at the cost of a small increase in the critical path, thus with little impact on the quality of the final parallel schedule.
Key-words:
Parallel scheduling of DAGs under memory constraints
Résumé: Scientific computing applications are often modeled as directed acyclic graphs (DAGs) of tasks, which represent the computational tasks and their dependencies, in the form of data produced by one task and used by another. This formulation allows the use of APIs that dynamically allocate tasks onto the resources of increasingly complex heterogeneous computing platforms. However, for some applications, such a dynamic schedule may run out of memory by exploiting too much parallelism. This article addresses the problem of transforming such a DAG to prevent any memory shortage, focusing on shared-memory platforms. We first propose a simple graph model that is expressive enough to emulate complex memory behaviors. We then present a polynomial-time algorithm that computes the maximum peak memory of a DAG, which represents the maximum memory required by any parallel schedule. We then consider the problem of reducing this maximum peak memory so that it becomes smaller than a given bound by adding fictitious edges, while trying to minimize the critical path of the graph. After proving this problem NP-complete, we provide an integer linear program solving it, as well as several heuristic strategies that are thoroughly compared on synthetic graphs modeling real computational applications. We show that on most instances the maximum peak memory can be decreased at the cost of a small increase in the critical path.
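The maximum peak memory mentioned above can be illustrated on a tiny example. A minimal sketch, under the common assumption that each edge carries the size of the data it transfers and that this data stays in memory from the completion of its producer until the completion of its consumer: the memory held at any point of a parallel execution is the total weight of the edges crossing from the set of completed tasks to the rest of the graph, so the maximum peak memory is the heaviest such cut. The DAG, its weights, and the brute-force enumeration below are illustrative only; the paper's algorithm is polynomial-time, whereas this sketch enumerates all predecessor-closed task sets and is exponential.

```python
from itertools import combinations

# Toy DAG: each edge carries the size of the data produced by one task
# and consumed by another (tasks and weights are illustrative assumptions).
edges = {("s", "a"): 2, ("s", "b"): 3, ("a", "t"): 2, ("b", "t"): 1}
nodes = {u for e in edges for u in e}

def is_downset(s):
    # A set of tasks is a reachable "already executed" state iff it is
    # closed under predecessors: every edge entering the set starts in it.
    return all(u in s for (u, v) in edges if v in s)

def cut_weight(s):
    # Memory held in a state: data produced inside the set but not yet
    # consumed, i.e. total weight of edges leaving the set.
    return sum(w for (u, v), w in edges.items() if u in s and v not in s)

def max_peak_memory():
    # Brute force over all downsets of the DAG; exponential, so only
    # usable on tiny graphs (the paper gives a polynomial algorithm).
    best = 0
    ns = sorted(nodes)
    for k in range(len(ns) + 1):
        for comb in combinations(ns, k):
            s = set(comb)
            if is_downset(s):
                best = max(best, cut_weight(s))
    return best

print(max_peak_memory())  # the heaviest topological cut of the toy DAG
```

On this toy graph the heaviest cut is reached once "s" has completed and both "a" and "b" may run in parallel; adding a fictitious edge between "a" and "b", as the paper proposes, would serialize them and lower this peak at the price of a longer critical path.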