Process placement, also called topology mapping, is a well-known strategy for improving parallel program execution by reducing the communication cost between processes. It requires two inputs: the topology of the target machine and a measure of the affinity between processes. In the literature, the dominant affinity measure is the communication matrix, which describes the amount of communication between processes. The goal of this paper is to study the accuracy of the communication matrix as a measure of affinity. We ran an extensive set of tests on two fat-tree machines and a 3D-torus machine to evaluate several hypotheses that are often made in the literature and to discuss their validity. First, we check the correlation between algorithmic metrics and the performance of the application. Then, we check whether a good generic process placement algorithm never degrades performance. Finally, we examine whether the structure of the communication matrix can be used to predict the gain.
Key-words: process placement, topology mapping, MPI, communication, algorithm, communication modeling, performance metric
Affinity between processes, metrics, and impact on performance: an experimental study
Abstract: Process placement that takes the machine topology into account is a well-known technique for reducing the execution time of a parallel program by lowering the cost of communication between processes. It requires two inputs: the topology of the target machine and a measure of the affinity between processes. In the literature, the predominant affinity measure is the communication matrix, which tallies the communications between processes. The goal of this paper is to study the relevance of the communication matrix as a measure of affinity. To this end, we ran a large number of tests on a fat-tree machine and on a 3D torus in order to evaluate several hypotheses that often appear in the literature and to discuss their validity. First, we check the correlation between algorithmic metrics and application performance. Next, we check that a good placement algorithm never leads to a degradation of an application's performance. Finally, we study the structure of the communication matrix to see whether it can be used to predict the gain.
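To make the notion of an algorithmic metric concrete, the sketch below computes hop-bytes, one metric commonly used in the topology-mapping literature: the communication volume of each process pair weighted by the distance between the cores hosting them. The abstract does not say which metrics the paper evaluates, so the choice of hop-bytes is an assumption, and the function name `hop_bytes`, the matrices, and the placements are hypothetical illustration data.

```python
def hop_bytes(comm, dist, placement):
    """Sum, over all process pairs, of exchanged bytes times hop distance.

    comm[i][j]   -- bytes sent from process i to process j
    dist[a][b]   -- hops between core a and core b (symmetric)
    placement[i] -- core that hosts process i
    """
    n = len(comm)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            volume = comm[i][j] + comm[j][i]  # traffic in both directions
            total += volume * dist[placement[i]][placement[j]]
    return total


# Hypothetical communication matrix: processes 0-1 and 2-3 talk in pairs.
comm = [
    [0, 100, 0, 0],
    [100, 0, 0, 0],
    [0, 0, 0, 100],
    [0, 0, 100, 0],
]
# Hypothetical distances: cores 0-1 share a node, cores 2-3 share another.
dist = [
    [0, 1, 3, 3],
    [1, 0, 3, 3],
    [3, 3, 0, 1],
    [3, 3, 1, 0],
]

packed = [0, 1, 2, 3]     # communicating pairs share a node
scattered = [0, 2, 1, 3]  # communicating pairs are split across nodes

print(hop_bytes(comm, dist, packed))
print(hop_bytes(comm, dist, scattered))
```

A lower value means that heavily communicating processes sit closer together. Whether a lower value also means a shorter run time is exactly the kind of hypothesis the paper sets out to test.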
Interconnection networks in parallel platforms can be made of thousands of nodes and hundreds of switches. The communication cost between tasks of a parallel application varies significantly with their actual location in such platforms. Topology-aware process mapping consists of matching the application communication pattern with the network topology, reducing the communication cost by placing related tasks close together on the hardware. We show that our Netloc tool, which gathers network topology in a generic way, can be combined with the state-of-the-art Scotch partitioner to compute topology-aware MPI process placement. Our experiments with a stencil application on a fat-tree machine show that we are able to significantly improve the runtime in the vast majority of cases.
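The idea of placing related tasks close together can be illustrated with a simple greedy heuristic. This is only a sketch under stated assumptions: it is not the algorithm used by Scotch, it does not call Netloc, and the function name `greedy_placement` and the input matrices are hypothetical. It assumes the topology has already been reduced to a symmetric hop-distance matrix between cores.

```python
def greedy_placement(comm, dist):
    """Greedy sketch: place each process near the partners already placed.

    comm[i][j] -- bytes sent from process i to process j
    dist[a][b] -- hops between core a and core b (symmetric)
    Returns a list where entry p is the core assigned to process p.
    """
    n = len(comm)
    # Symmetrize: total traffic exchanged by each pair of processes.
    vol = [[comm[i][j] + comm[j][i] for j in range(n)] for i in range(n)]

    placement = {}
    free = set(range(n))

    # Seed with the process that communicates the most, on core 0.
    first = max(range(n), key=lambda p: sum(vol[p]))
    placement[first] = 0
    free.remove(0)

    while len(placement) < n:
        # Next process: the one with the most traffic to placed processes.
        nxt = max(
            (p for p in range(n) if p not in placement),
            key=lambda p: sum(vol[p][q] for q in placement),
        )
        # Best core: the free one minimizing distance-weighted traffic.
        best = min(
            sorted(free),
            key=lambda s: sum(
                vol[nxt][q] * dist[s][placement[q]] for q in placement
            ),
        )
        placement[nxt] = best
        free.remove(best)

    return [placement[p] for p in range(n)]


# Hypothetical inputs: processes 0-2 and 1-3 communicate in pairs;
# cores 0-1 share a node, cores 2-3 share another.
comm = [
    [0, 0, 100, 0],
    [0, 0, 0, 100],
    [100, 0, 0, 0],
    [0, 100, 0, 0],
]
dist = [
    [0, 1, 3, 3],
    [1, 0, 3, 3],
    [3, 3, 0, 1],
    [3, 3, 1, 0],
]

mapping = greedy_placement(comm, dist)
print(mapping)
```

On this input, each communicating pair ends up on cores that share a node. Production partitioners use far more elaborate methods; the sketch only shows how a communication matrix and a topology description combine into a placement.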