2015
DOI: 10.1007/s11265-015-1056-7

He-P2012: Performance and Energy Exploration of Architecturally Heterogeneous Many-Cores

Abstract: The end of Dennardian scaling in advanced technologies brought about new architectural templates to overcome the so-called utilization wall and provide Moore's Law-like performance and energy scaling in embedded SoCs. One of the most promising templates, architectural heterogeneity, is hindered by high cost due to the design space explosion and the lack of effective exploration tools. Our work provides three contributions towards a scalable and effective methodology for design space exploration in embedded MC-…

Cited by 4 publications (3 citation statements)
References: 37 publications
“…Hardware Processing Elements: PULP clusters can be enhanced with application-specific accelerators, called Hardware Processing Elements (HWPE) [21], to deliver higher levels of performance and energy efficiency for specific tasks. Unlike many accelerator designs HWPEs do not necessarily rely on an external DMA to feed them with input data and extract output data, and they are not tied to a single core.…”
Section: B. The PULP Project and HERO (mentioning)
confidence: 99%
“…The ATU is a specialized component similar to the memory management unit (MMU) for processor cores. Differently from the MMU that uses a TLB to convert the addresses, the ATU design is customized for the specific data structure [34] and to guarantee that the translation does not introduce any … [Figure caption: Architecture of the ATU proposed in this paper and address decomposition.]…”
Section: Generation of the PLM Controller (mentioning)
confidence: 99%
“…The GPU also significantly outperforms the CPU in video memory access speed. An efficient organization of the memory subsystem improves the overall efficiency of the graphics processor on non-graphics tasks [5][6][7][8][9].…”
unclassified