Parallel Multi-core Verilog HDL Simulation Using Domain Partitioning

Ahmad, Tariq; Ciesielski, Maciej

doi:10.1109/isvlsi.2014.47

Cited by 3 publications

(1 citation statement)

References 4 publications

“…In the case of RTL simulations, there are proposals that use MPI, OpenMP or a custom simulator to accelerate simulation. Tariq et al [2] make use of domain partitioning and OpenMP, obtaining up to a 3.3× simulation time speedup. Essent [8] proposes a new simulator that uses an intermediate language for hardware: FIRRTL [15], which accelerates simulations with practical techniques to reuse and avoid doing extra work.…”

Section: G Metro-MPI Heterogeneity (mentioning)

confidence: 99%

Fast Behavioural RTL Simulation of 10B Transistor SoC Designs with Metro-MPI

López-Paradís, Armejach, et al. 2023

2023 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Chips with tens of billions of transistors have become today's norm. These designs are straining our electronic design automation tools throughout the design process, requiring ever more computational resources. In many tools, parallelisation has improved both latency and throughput for the designer's benefit. However, tools largely remain restricted to a single machine and in the case of RTL simulation, we believe that this leaves much potential performance on the table. We introduce Metro-MPI to improve RTL simulation for modern 10 billion transistor-scale chips. Metro-MPI exploits the natural boundaries present in chip designs to partition RTL simulations and leverage High Performance Computing (HPC) techniques to extract parallelism. For chip designs that scale in size by exploiting latency-insensitive interfaces like networks-on-chip and AXI, Metro-MPI offers a new paradigm for RTL simulation scalability. Our implementation of Metro-MPI in OpenPiton+Ariane delivers 2.7 MIPS of RTL simulation throughput for the first time on a design with more than 10 billion transistors and 1,024 Linux-capable cores, opening new avenues for distributed RTL simulation of emerging system-on-chip designs. Compared to sequential and multithreaded RTL simulations of smaller designs, Metro-MPI achieves up to 135.98× and 9.29× speedups. Similarly, for a representative regression run, Metro-MPI reduces energy consumption by up to 2.53× and 2.91×.
