The 20th Asia and South Pacific Design Automation Conference 2015
DOI: 10.1109/aspdac.2015.7059093

A trace-driven approach for fast and accurate simulation of manycore architectures

Abstract: The evolution of manycore systems, forecasted to feature hundreds of cores by the end of the decade, calls for efficient solutions for design space exploration and debugging. Among the relevant existing solutions, the well-known gem5 simulator provides a rich architecture description framework. However, these features come at the price of prohibitive simulation time that limits the scope of possible explorations to configurations made of tens of cores. To address this limitation, this pape…

Cited by 29 publications (24 citation statements)
References 14 publications
“…The second step is the application execution trace, from start to finish, as the sequential portions of code can hinder the real speedup of a parallel application [17]. The application trace, which was based on [5], provides the execution times between synchronization calls, and the number and execution time of each synchronization call. The trace is further annotated with every synchronization primitive, so that we can simulate these functions accurately.…”
Section: Validation Methodology and Exp Setup (mentioning)
confidence: 99%
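
The trace structure described in this statement (compute time between synchronization calls, plus the cost of each call) can be illustrated with a minimal replay sketch. It is not the cited authors' tool: the `Compute` and `Barrier` record types and the `replay` function are hypothetical names, and the sketch assumes that every synchronization call is a global barrier reached by all threads in the same order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compute:
    duration: float  # time spent between two synchronization calls


@dataclass(frozen=True)
class Barrier:
    cost: float  # execution time of the synchronization call itself


def replay(traces):
    """Replay per-thread traces and return each thread's finish time.

    Assumption (not from the source): every synchronization call is a
    global barrier that all threads reach in the same order.
    """
    clocks = [0.0] * len(traces)
    cursors = [0] * len(traces)
    while True:
        # Advance every thread through compute segments up to its next barrier.
        for t, trace in enumerate(traces):
            while cursors[t] < len(trace) and isinstance(trace[cursors[t]], Compute):
                clocks[t] += trace[cursors[t]].duration
                cursors[t] += 1
        waiting = [t for t, trace in enumerate(traces) if cursors[t] < len(trace)]
        if not waiting:
            return clocks
        if len(waiting) != len(traces):
            raise ValueError("threads disagree on the number of barriers")
        # All threads leave the barrier once the slowest one has arrived.
        release = max(clocks)
        for t in waiting:
            clocks[t] = release + traces[t][cursors[t]].cost
            cursors[t] += 1


if __name__ == "__main__":
    traces = [
        [Compute(5), Barrier(1), Compute(2)],
        [Compute(3), Barrier(1), Compute(4)],
    ]
    print(replay(traces))
```

In this example the second thread reaches the barrier first and waits for the first one, so the sequential imbalance before the barrier shows up directly in the simulated finish times.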
“…We demonstrate our solution by employing the Gem5 simulator with the data processing applications Streamcluster and Bodytrack from the PARSEC benchmark. Like Butko et al. [5], we produced their synchronization points; next, we feed this information to an in-house SystemC simulator, which enables us to collect experimental results. The main contributions of this paper are: (i) a solution to accelerate application execution without requiring source code modification; (ii) an enhanced NI architecture able to compute and accelerate expensive synchronization primitives and being compliant with any NI that has access to a local memory; (iii) a set of APIs for parallel computing architectures; (iv) a trace-based simulation tool to allow fast simulations of real parallel applications; and (v) synthesis of the enhanced NI architecture with a 28 nm SOI technology.…”
Section: Introduction (mentioning)
confidence: 99%
“…Third, the flexibility of gem5 allows users to easily model new architectures, new cache management policies, or any new optimization techniques at architecture level. In addition, gem5 is potentially able to support the exploration of manycore architectures with more than one hundred cores by applying the trace-driven approach proposed in [38].…”
Section: A Overview (mentioning)
confidence: 99%
“…However, its application is restricted to mono-core execution and no synchronization mechanism is presented. On the other hand, a synchronization mechanism for multi-threaded applications is presented in [12]. In a similar way, the authors of [13] proposed a collection mechanism for parallel events and a playback methodology to allow architectural exploration.…”
Section: A Traditional Simulators (mentioning)
confidence: 99%
“…SimMATE [12] is a trace-driven simulator that operates on top of gem5 and is devoted to the exploration of in-order manycore architectures. Traces collected on a reference architecture in Full-System mode are made of outgoing memory transactions collected at Level-1 caches, i.e.…”
Section: B Trace-driven Extensions Of Gem5 (mentioning)
confidence: 99%