Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques 2018
DOI: 10.1145/3243176.3243184
E-PUR: An Energy-Efficient Processing Unit for Recurrent Neural Networks

Cited by 27 publications (12 citation statements)
References 17 publications
“…This section presents the evaluation of the proposed fuzzy memoization technique for RNNs, implemented on top of E-PUR [30]. We refer to it as E-PUR+BM.…”
Section: Results (mentioning)
confidence: 99%
“…Typically, the number of elements in the weight matrices ranges from a few thousands to millions of elements and, thus, fetching them from on-chip buffers or main memory is one of the major sources of energy consumption. Not surprisingly, it accounts for up to 80% of the total energy consumption in state-of-the-art accelerators [30]. For this reason, a very effective way of saving energy in RNNs is to avoid fetching the synaptic weights.…”
Section: Motivation (mentioning)
confidence: 99%
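The fuzzy memoization idea referenced in the statements above can be illustrated with a rough sketch: cache each neuron's most recent input and output, and when a new input is sufficiently similar, reuse the cached output instead of recomputing the dot product, so the corresponding weight row never has to be fetched. This is only an illustrative sketch of that general idea under assumed details; the class name, the similarity test, and the threshold are placeholders, not the actual E-PUR+BM mechanism.

# Illustrative sketch of neuron-level fuzzy memoization (not the actual
# E-PUR+BM design): reuse a neuron's cached output when the current input
# is close to the previously seen input, skipping the dot product and the
# associated weight fetch. Names, threshold, and similarity test are assumptions.
import numpy as np

class FuzzyMemoNeuron:
    def __init__(self, weights, threshold=0.05):
        self.w = np.asarray(weights, dtype=np.float32)  # synaptic weights (the costly fetch)
        self.threshold = threshold                      # similarity tolerance
        self.last_x = None                              # cached input
        self.last_y = None                              # cached output

    def forward(self, x):
        x = np.asarray(x, dtype=np.float32)
        if self.last_x is not None:
            # Cheap similarity test on the inputs; if close enough,
            # return the memoized output and avoid touching the weights.
            if np.max(np.abs(x - self.last_x)) < self.threshold:
                return self.last_y
        y = float(np.dot(self.w, x))  # full computation: requires fetching self.w
        self.last_x, self.last_y = x, y
        return y

# Usage: consecutive, slowly varying inputs (typical of speech frames) hit the memo.
neuron = FuzzyMemoNeuron(weights=np.random.randn(256))
x0 = np.random.randn(256)
y0 = neuron.forward(x0)
y1 = neuron.forward(x0 + 0.01)  # likely served from the memo, no weight fetch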
“…While for most applications training is a one-time task, and can therefore be performed in the cloud, there is a growing demand for executing NN inference on embedded systems (so-called "edge" nodes), in order to enhance the features of many Internet of Things (IoT) applications [3]. In fact, edge inference could yield benefits in terms of data privacy, response latency and energy efficiency, as it would eliminate the need of transmitting high volumes of raw data to the cloud [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19].…”
Section: Introduction (mentioning)
confidence: 99%
“…One of the most popular approaches is to design custom hardware accelerators to implement the most critical operations involved in the inference phase, which are typically multiplications of large matrices and vectors, in a fast and efficient way. Most accelerators have been designed for convolutional neural networks (CNNs), due to their outstanding results in computer vision applications [3,6,7,12,16], but more recently, hardware acceleration of sequence-to-sequence models, such as RNNs and transformers, has also been investigated extensively [10,15,[17][18][19].…”
Section: Introduction (mentioning)
confidence: 99%
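To make concrete what "multiplications of large matrices and vectors" means for recurrent models, the sketch below spells out a single LSTM cell step as a handful of matrix-vector products. This is the standard textbook formulation, not any particular accelerator's datapath; the dimensions and names are placeholders.

# Minimal NumPy sketch of one LSTM cell step, showing that the bulk of the
# work (and of the weight traffic) is matrix-vector multiplication.
# Standard textbook formulation; sizes and names are placeholders.
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """x: input vector; h_prev/c_prev: previous hidden/cell state.
    W: (4H, X) input weights, U: (4H, H) recurrent weights, b: (4H,) biases."""
    z = W @ x + U @ h_prev + b          # the dominant matrix-vector products
    i, f, o, g = np.split(z, 4)         # input, forget, output gates and candidate
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g              # new cell state
    h = o * np.tanh(c)                  # new hidden state
    return h, c

# Usage with placeholder sizes: a 128-dimensional input and 256 hidden units.
X, H = 128, 256
h, c = lstm_step(np.random.randn(X), np.zeros(H), np.zeros(H),
                 np.random.randn(4 * H, X), np.random.randn(4 * H, H),
                 np.zeros(4 * H))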