MUSA: A Multi-level Simulation Approach for Next-Generation HPC Machines

Grass, Thomas; Allande, Cesar; Armejach, Adrià; Rico, Alejandro; Ayguadé, Eduard; Labarta, Jesús; Valero, Mateo; Casas, Marc; Moretó, Miquel

doi:10.1109/sc.2016.44

Cited by 32 publications

(35 citation statements)

References 49 publications

“…Skeletons are code extractions of the most important parts of a complex application whereas we only modify a few dozens of lines of HPL before emulating it with SMPI. Finally, it is important to understand that the approach we propose is intended to help studies at the level of the whole machine and application, not the influence of microarchitectural details as intended by MUSA [15].…”

Section: Related Work (mentioning)

confidence: 99%

Fast and Faithful Performance Prediction of MPI Applications: the HPL Case Study

Cornebize, Legrand, Heinrich. 2019

2019 IEEE International Conference on Cluster Computing (CLUSTER)

“…To evaluate TaskGenX we make use of the TaskSim simulator [16,24]. TaskSim is a trace driven simulator, that supports the specification of homogeneous or heterogeneous systems with many cores.…”

Section: Simulation (mentioning)

confidence: 99%

“…TaskSim is a trace driven simulator, that supports the specification of homogeneous or heterogeneous systems with many cores. The tracing overhead of the simulator is less than 10% and the simulation is accurate as long as there is no contention in the shared memory resources on a real system [16]. By default, TaskSim allows the specification of the amount of cores and supports up to two core types in the case of heterogeneous asymmetric systems.…”

Section: Simulation (mentioning)

confidence: 99%

TaskGenX: A Hardware-Software Proposal for Accelerating Task Parallelism

Chronaki, Casas, Moretó, et al. 2018

Lecture Notes in Computer Science

Self Cite

As chip multi-processors (CMPs) are becoming more and more complex, software solutions such as parallel programming models are attracting a lot of attention. Task-based parallel programming models offer an appealing approach to utilize complex CMPs. However, the increasing number of cores on modern CMPs is pushing research towards the use of fine-grained parallelism. Task-based programming models need to be able to handle such workloads and offer performance and scalability. Using specialized hardware for boosting performance of task-based programming models is a common practice in the research community. Our paper makes the observation that task creation becomes a bottleneck when we execute fine-grained parallel applications with many task-based programming models. As the number of cores increases, the time spent generating the tasks of the application is becoming more critical to the entire execution. To overcome this issue, we propose TaskGenX. TaskGenX offers a solution for minimizing task creation overheads and relies on both the runtime system and dedicated hardware. On the runtime system side, TaskGenX decouples the task creation from the other runtime activities. It then transfers this part of the runtime to specialized hardware. We draw the requirements for this hardware in order to boost execution of highly parallel applications. From our evaluation using 11 parallel workloads on both symmetric and asymmetric systems, we obtain performance improvements up to 15×, averaging to 3.1× over the baseline.

“…Apart from assessing requirements and enabling software and future technologies, ARM and its partners have focused on research for future processor technologies including architecture and micro-architecture solutions for high-end systems with a focus on HPC. Simulation tools are critical for research and development and several new methodologies have been proposed and implemented to enable simulation of large HPC systems [15,12,11,10,8].…”

Section: Road to HPC (mentioning)

confidence: 99%

ARM HPC Ecosystem and the Reemergence of Vectors

Rico, Joao, Adeniyi-Jones, et al. 2017

Proceedings of the Computing Frontiers Conference

Self Cite

ARM's involvement in funded international projects has helped pave the road towards ARM-based supercomputers. ARM and its partners have collaboratively grown an HPC ecosystem with software and hardware solutions that provide choice in a unified software ecosystem. Partners have announced important HPC deployments resulting from collaborations around the globe. One of the key enabling technologies for ARM in HPC is the Scalable Vector Extension, an instruction set extension for vector processing. This paper discusses ARM's journey into HPC, the current state of the ARM HPC ecosystem, the approach to HPC node architecture co-design, and details on the Scalable Vector Extension as a future technology representing the reemergence of vectors.
