Proceedings of the 50th Annual Design Automation Conference 2013
DOI: 10.1145/2463209.2488836

Simultaneous multithreading support in embedded distributed memory MPSoCs

Abstract: Scalability and programmability are important issues in large homogeneous MPSoCs. Such architectures often rely on explicit message passing among processors, each of which possesses a local private memory. This paper presents a low-overhead hardware/software distributed shared memory approach that makes such architectures multithreading-capable. The proposed solution is implemented in an open-source message-passing MPSoC by developing a POSIX-like thread API, which shows excellent scalability using app…
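The abstract refers to a POSIX-like thread API layered over the hardware/software DSM. As a minimal sketch only (the paper's actual API names are not reproduced in this excerpt, so standard pthreads names are assumed), a program written against such an interface would look like ordinary shared-memory threading code, with shared data and synchronization backed by remote memory accesses:

/* Minimal sketch assuming standard pthreads names; the paper's API is
 * described as POSIX-like, but its exact calls are not shown here.
 * On the proposed platform the shared counter and mutex would be backed
 * by the HW/SW distributed shared memory rather than one local RAM. */
#include <pthread.h>
#include <stdio.h>

static int shared_counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);    /* would map to remote-memory-access operations */
    shared_counter++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);  /* threads may be placed on different cores */
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("counter = %d\n", shared_counter);
    return 0;
}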

Cited by 14 publications (12 citation statements). References 7 publications.
“…To incorporate DSM capabilities in the chosen platform [17], we apply a number of modifications summarized in Figure 2. The concerned parts are highlighted: microkernel, software stack allocation and RMA module.…”
Section: Modifications of Application Runtime and Hardware
confidence: 99%
“…We consider the open-source and customizable NoC-based MPSoC platform [17] implemented at RTL level. A very interesting feature of this customizable multicore platform is its ability to enable the creation of clusters according to the CSM design (see left-hand side of Figure 1).…”
confidence: 99%
“…The expanded threads on this core are put in an array ThdArr (line 2). For every pair of threads in this array, the collapsed-mode execution time is calculated using [10] and the cross-thread energy ratio using Equation 2. If the energy ratio is greater than the maximum ratio computed thus far, the maximum value is updated.…”
Section: Energy-aware Thread Collapsing
confidence: 99%
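The pairwise scan described in this excerpt can be sketched as below; collapsed_exec_time() and energy_ratio() are hypothetical placeholders for the execution-time model of [10] and Equation 2, which the excerpt does not reproduce:

/* Hedged sketch of the pairwise scan over ThdArr described above.
 * The two helper functions are illustrative stand-ins only, not the
 * cited models from [10] and Equation 2. */
#include <stddef.h>

typedef struct { double exec_time; double energy; } Thread;

/* Placeholder for the collapsed-mode execution-time model of [10]. */
static double collapsed_exec_time(const Thread *a, const Thread *b)
{
    return a->exec_time > b->exec_time ? a->exec_time : b->exec_time;
}

/* Placeholder for the cross-thread energy ratio of Equation 2. */
static double energy_ratio(const Thread *a, const Thread *b, double t_collapsed)
{
    return (a->energy + b->energy) / (t_collapsed + 1e-9);
}

/* Scan every thread pair and remember the pair with the highest ratio. */
static void best_pair(const Thread *ThdArr, size_t n,
                      size_t *best_i, size_t *best_j, double *max_ratio)
{
    *max_ratio = 0.0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double t = collapsed_exec_time(&ThdArr[i], &ThdArr[j]);
            double r = energy_ratio(&ThdArr[i], &ThdArr[j], t);
            if (r > *max_ratio) {   /* update the maximum ratio seen so far */
                *max_ratio = r;
                *best_i = i;
                *best_j = j;
            }
        }
    }
}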
“…The operating point (v_l, f_l) which results in the least positive slack is selected (line 8). The overall thread-centric energy improvement is determined (lines 10-12). If this is < 1 (implying slowdown has a lower energy consumption than race-to-idle), (v_l, f_l) is selected as the frequency of the core; else (v_Nl, f_Nl) is selected.…”
Section: E. Energy Optimization: Slowdown vs. Race-to-idle
confidence: 99%
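The selection rule quoted here can be sketched as follows; the energy model is an assumed placeholder (E ≈ C·v²·f·t), and only the final comparison against 1 follows the quoted logic:

/* Hedged sketch of choosing between the slowdown point (v_l, f_l) and
 * the nominal race-to-idle point (v_N, f_N). The energy() model below
 * is an assumption for illustration, not the paper's model. */
#include <stdio.h>

typedef struct { double v; double f; } OpPoint;

/* Assumed dynamic-energy estimate: E = C_eff * v^2 * f * active_time. */
static double energy(OpPoint p, double active_time)
{
    const double c_eff = 1.0;    /* assumed effective switching capacitance */
    return c_eff * p.v * p.v * p.f * active_time;
}

/* improvement < 1 means slowing down uses less energy than race-to-idle. */
static OpPoint select_point(OpPoint slowdown, OpPoint nominal,
                            double t_slowdown, double t_nominal)
{
    double improvement = energy(slowdown, t_slowdown) / energy(nominal, t_nominal);
    return (improvement < 1.0) ? slowdown : nominal;
}

int main(void)
{
    OpPoint v_l = {0.9, 400e6};   /* slowdown point: just meets the deadline */
    OpPoint v_N = {1.1, 800e6};   /* nominal point: finish early, then idle */
    OpPoint chosen = select_point(v_l, v_N, 2.0e-3, 1.0e-3);
    printf("chosen frequency: %.0f Hz\n", chosen.f);
    return 0;
}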
“…The collective communication functions defined in the MPI library are converted into a set of point-to-point communication functions by the MPI library cell, so as to ease programming. As collective communication functions account for up to 80% of the data transmission latency, it is very important to improve the handling of these functions [3,4,5].…”
Section: Introduction
confidence: 99%
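As a generic illustration of flattening a collective into point-to-point calls (not the cited MPI-cell's actual implementation), a broadcast can be expanded into a loop of sends from the root and a single receive on every other rank:

/* Generic illustration only: a naive MPI_Bcast built from point-to-point
 * MPI_Send/MPI_Recv calls. Real implementations (and, per the excerpt,
 * the MPI library cell) typically use smarter schemes to cut latency. */
#include <mpi.h>

static void naive_bcast(void *buf, int count, MPI_Datatype type,
                        int root, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    if (rank == root) {
        for (int dst = 0; dst < size; dst++)
            if (dst != root)
                MPI_Send(buf, count, type, dst, 0 /* tag */, comm);
    } else {
        MPI_Recv(buf, count, type, root, 0, comm, MPI_STATUS_IGNORE);
    }
}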