2016
DOI: 10.1109/tcad.2016.2611506

System-Level Optimization of Accelerator Local Memory for Heterogeneous Systems-on-Chip

Abstract: In modern system-on-chip architectures, specialized accelerators are increasingly used to improve performance and energy efficiency. The growing complexity of these systems requires the use of system-level design methodologies featuring high-level synthesis (HLS) for generating these components efficiently. Existing HLS tools, however, have limited support for the system-level optimization of memory elements, which typically occupy most of the accelerator area. We present a complete methodology for designing t…

Citations: cited by 22 publications (39 citation statements)
References: 35 publications (47 reference statements)
“…4). We then extended MNEMOSYNE [16], a prototype CAD tool to generate

    void Store() {
        int ping = 0;
        wait();
        while (true) {
            /// perform DMA request
            /// ...
            if (ping) DARKMEM_ACTIVATE(ctrl_B0);
            else DARKMEM_ACTIVATE(ctrl_B1);
            /// store produced results
            /// ...
            if (ping) DARKMEM_IDLE(ctrl_B0);
            else DARKMEM_IDLE(ctrl_B1);
            /// notify compute
            /// ...
            ping = !ping;
        }
    }

Fig. 5.…”
Section: Design Methodology
Citation type: mentioning
confidence: 99%
“…
• the SRAM banks: based on the technology information (e.g., available SRAM sizes, static power, etc.), we can implement the corresponding array with different dual-rail SRAMs transparently to the accelerator execution [16];
• the scenario memory controller (SMC): based on the values of the configuration registers, the SMC module determines at the beginning of the execution which banks are not necessary and, therefore, they can be power gated for the entire execution of the accelerator (i.e., until it is configured to start a new execution with different parameters);
• the operating mode controller (OMC): based on a set of command signals from the accelerator logic, the OMC module determines the SRAM operating modes (i.e., when to apply power gating to the periphery circuitry or also to the memory cells).
The results of these two controllers are then combined with OR gates to determine the actual power management for each single bank.…”
Section: Darkmem Architecture
Citation type: mentioning
confidence: 99%
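The per-bank OR combination described in the statement above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the citing paper's hardware: the bank count, the function names `smc_gate`, `omc_gate`, and `combine`, and the rule that the SMC gates banks beyond the configured working set are all hypothetical. The model also collapses the OMC's operating modes (periphery-only versus full gating) into a single gate bit per bank.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical bank count; the quoted text does not give one.
constexpr std::size_t kBanks = 4;

// Bit i set means "bank i is power gated".
using BankMask = std::bitset<kBanks>;

// SMC stand-in (assumed rule): banks beyond the configured working set
// are not needed, so they are gated for the whole execution.
BankMask smc_gate(std::size_t words_needed, std::size_t words_per_bank) {
    const std::size_t used =
        (words_needed + words_per_bank - 1) / words_per_bank;  // ceiling division
    BankMask gate;
    for (std::size_t i = 0; i < kBanks; ++i) {
        gate[i] = (i >= used);
    }
    return gate;
}

// OMC stand-in: the accelerator logic marks banks as idle at run time.
// Simplification: one bit per bank instead of several operating modes.
BankMask omc_gate(const BankMask& idle_commands) {
    return idle_commands;
}

// Per-bank OR of the two controller outputs, as the quoted text describes.
BankMask combine(const BankMask& smc, const BankMask& omc) {
    return smc | omc;
}
```

For example, with 256-word banks and a 600-word working set, `smc_gate(600, 256)` gates only bank 3; if the accelerator additionally marks bank 0 idle, `combine` reports banks 0 and 3 as gated.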
“…For HLS DSE and memory-based optimization, Pilato et al [30] provided a system-level optimization for a memory system in order to automatically generate more efficient architectures by means of their proposed methodology for HLS DSE. Schafer [12] performed a new method to accelerate the HLS DSE by classifying the HLS optimization knobs, and he also performed an HLS resource sharing DSE by fixing the bitwidth of internal variables in [31].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…HLS can apply transformations to optimize the performance of the resulting IP component. For example, loop unrolling and multi-port memories can be combined to execute more operations in parallel [18]. An example of resulting FSM is shown in Figure 4a, where the loop body is repeated for a particular number of times based on the value of the parameter ntaps.…”
Section: B. Degradation Attack
Citation type: mentioning
confidence: 99%
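The transformation mentioned in the last statement, loop unrolling paired with multi-port memories, can be sketched in plain C++. The FIR-style loop bounded by `ntaps` is an assumption suggested by the quoted parameter name, and the function names are hypothetical. In the unrolled version each iteration reads two elements from each array, which is the access pattern an HLS tool could schedule in parallel if the arrays map to dual-port memories.

```cpp
#include <cstddef>
#include <vector>

// Rolled form: one multiply-accumulate per iteration, one read per array.
int fir_rolled(const std::vector<int>& coeff,
               const std::vector<int>& window,
               std::size_t ntaps) {
    int acc = 0;
    for (std::size_t i = 0; i < ntaps; ++i) {
        acc += coeff[i] * window[i];
    }
    return acc;
}

// Unrolled by 2: two independent multiply-accumulates per iteration.
// Each array is read twice per iteration, so a dual-port memory would
// let both reads happen in the same cycle.
int fir_unrolled2(const std::vector<int>& coeff,
                  const std::vector<int>& window,
                  std::size_t ntaps) {
    int acc0 = 0;
    int acc1 = 0;
    std::size_t i = 0;
    for (; i + 1 < ntaps; i += 2) {
        acc0 += coeff[i] * window[i];
        acc1 += coeff[i + 1] * window[i + 1];
    }
    if (i < ntaps) {  // leftover tap when ntaps is odd
        acc0 += coeff[i] * window[i];
    }
    return acc0 + acc1;
}
```

Both functions compute the same sum; the unrolled form only changes how many operations are exposed per loop iteration.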