Abstract-Architectural complexity continues to grow as we consider the large design space of multiple cores, cache architectures, networks-on-chip and memory controllers for emerging architectures. Simulators are growing in complexity to reflect each of these system components. However, many full-system simulators fail to take advantage of underlying hardware resources such as multiple cores; as a result, simulation times have grown significantly in recent years. Long turnaround times limit the range and depth of design space exploration that is tractable. Communication has emerged as a first-class design consideration and has led to significant research into networks-on-chip (NoC). The NoC is yet another component of the architecture that must be faithfully modeled in simulation. Given its importance, we focus on accelerating NoC simulation through the use of sampling techniques; sampling can provide both accurate results and fast evaluation. We propose NoCLabs and NoCPoint, two sampling methodologies utilizing statistical sampling theory and traffic phase behavior, respectively. Experimental results show that NoCLabs and NoCPoint estimate NoC performance with an average error of 5% while achieving an order-of-magnitude speedup on average.
I. INTRODUCTION

As the number of cores in contemporary processors continues to scale, the criticality of Network-on-Chip (NoC) design to overall performance increases accordingly. NoC designers rely heavily on full-system simulation to faithfully evaluate their designs. In full-system simulation running multi-threaded applications, the interaction between applications, cache coherence protocols and the network is fully exercised, so the performance of new designs is accurately evaluated.

Although full-system simulation enjoys the benefit of high fidelity, it suffers from prohibitively long turnaround times. Sampled full-system simulation [2], [5], [7], [9] is an effective technique to reduce simulation turnaround times for single-threaded, multi-threaded and multiprogrammed applications. In sampled full-system simulation, only a small but representative portion of the application is simulated in detail. Performance metrics measured on the sampled portion are used to estimate the true values of those metrics; the unsampled intervals are either fast-forwarded using functional simulation or skipped entirely. Existing work applies to a wide range of applications (single- or multi-threaded) and architectures (homogeneous or heterogeneous); however, it mainly focuses on evaluating micro-architecture designs, and reports metrics such as CPI (single-threaded and multiprogrammed applications) or run time (multi-threaded applications). To the best of our knowledge, there is no existing work exploring sampling methodologies for NoC simulation. In this paper, we introduce two sampling methodologies for NoC simulation: NoCLabs and NoCPoint.
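The general idea of sampled simulation described above can be illustrated with a minimal sketch. It is not the NoCLabs or NoCPoint methodology: it only shows how a metric measured on a small, periodically chosen subset of intervals can estimate the full-run value, together with an approximate confidence interval. The synthetic latency trace, the function names, the sampling period of 100 and the normal-approximation factor z = 1.96 are all assumptions made for illustration.

```python
import math
import random
import statistics


def synthetic_interval_latencies(num_intervals, seed=0):
    """Hypothetical per-interval average packet latency in cycles.

    Stands in for what a detailed simulator would report per interval.
    """
    rng = random.Random(seed)
    return [20.0 + 5.0 * math.sin(i / 50.0) + rng.gauss(0.0, 2.0)
            for i in range(num_intervals)]


def systematic_sample(values, period, offset=0):
    """Keep every `period`-th interval, starting at `offset`."""
    return values[offset::period]


def estimate_mean(sample, z=1.96):
    """Return (sample mean, half-width of an approximate 95% interval)."""
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * stderr


# Only the sampled intervals would be simulated in detail; the rest
# would be fast-forwarded or skipped.
latencies = synthetic_interval_latencies(10_000)
sample = systematic_sample(latencies, period=100)
estimate, half_width = estimate_mean(sample)
true_mean = statistics.fmean(latencies)

print(f"detailed intervals: {len(sample)} of {len(latencies)}")
print(f"estimate: {estimate:.2f} +/- {half_width:.2f} cycles "
      f"(full-run mean {true_mean:.2f})")
```

In this sketch only 1% of the intervals contribute to the estimate, which is where the reduction in detailed simulation time would come from; the confidence half-width indicates how far the estimate may plausibly lie from the full-run mean under the stated assumptions.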