2015 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC)
DOI: 10.1109/vlsi-soc.2015.7314386

Tailoring instruction-set extensions for an ultra-low power tightly-coupled cluster of OpenRISC cores

Abstract: Baseline RISC instruction sets for ultra-low power processors are constantly being tuned to reduce cycle count when executing computation-intensive applications. Performance improvements often come at a non-negligible price in terms of area and critical path length and imply deeper pipelines and complex memory interfaces. This penalizes control-intensive code execution and significantly increases cost and complexity of building multi-core clusters. In addition, some extensions are not easily exploited by compi…

Citations: cited by 24 publications (12 citation statements)
References: 8 publications
“…The PULP cluster (see Figure 3), that we consider as a baseline for the proposed exploration, includes a configurable number of processing elements (PEs), based on OpenRISC OR10N cores [34] [35], each featuring a private instruction cache. The OR10N cores are based on an in-order, single-issue, four stage pipeline micro-architecture without branch prediction, improved with extensions for higher throughput and energy efficiency in parallel signal processing workloads [34]. No data caches are present, therefore avoiding memory coherency overhead and additional area penalties [30].…”
Section: SoC Architecture (mentioning)
confidence: 99%
“…The proposed SoC implements the third generation PULP (Parallel Ultra-Low-Power) platform extended with a dedicated accelerator for convolution intensive processing [16]. The programmable computing engine of the SoC is based on a tightly coupled cluster of 4 OpenRISC ISA cores called OR10N enhanced for energy efficient digital signal processing [17]. The cluster features a shared 4kB latch-based Standard Cell Memory (SCM) [18] instruction cache that, coupled with a private per-core L0 buffer, increases energy efficiency by 30% with respect to an SRAM-based private cache architecture [15].…”
Section: SoC Architecture (mentioning)
confidence: 99%
“…1) MCU: The employed MCU is an implementation derived from the PULP (Parallel-Ultra-Low-Power) Platform [23], [24]. It comes with 4 general-purpose openRISC cores sharing the level 1 (L1) memory - tightly coupled data memory (TCDM).…”
Section: A. The Biomedical SoC Architecture (mentioning)
confidence: 99%