2020
DOI: 10.1109/tcad.2020.2981056
Safe Overclocking for CNN Accelerators Through Algorithm-Level Error Detection

Abstract: In this paper, we propose a technique for improving the efficiency of CNN hardware accelerators based on timing speculation (overclocking) and fault tolerance. We augment the accelerator with a lightweight error detection mechanism to protect against timing errors in convolution layers, enabling aggressive timing speculation. The error detection mechanism we have developed works at the algorithm level, utilizing algebraic properties of the computation, allowing the full implementation to be realized using High…
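The algebraic property the abstract alludes to is the linearity of convolution: the sum of the per-channel outputs must equal one convolution of the input with the summed filters, so a single redundant convolution can check all output channels. A minimal sketch of this ABFT-style invariant (illustrative only, not the paper's exact hardware mechanism; all sizes and the fault model are assumptions):

```python
import numpy as np

def conv2d(x, w):
    """Naive 'valid' 2-D cross-correlation."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))           # input feature map (hypothetical size)
filters = rng.standard_normal((4, 3, 3))  # 4 output channels

# Work the accelerator would perform: one convolution per output channel.
outputs = [conv2d(x, w) for w in filters]

# Algorithm-level check: by linearity, sum_k conv(x, w_k) == conv(x, sum_k w_k),
# so one extra convolution verifies all four channels at once.
checksum = conv2d(x, filters.sum(axis=0))
fault_free = np.allclose(sum(outputs), checksum)

# Simulate a timing error corrupting one partial result under overclocking.
outputs[2][1, 1] += 0.25
fault_detected = not np.allclose(sum(outputs), checksum)
```

The check costs one convolution regardless of the channel count, which is why such detection can stay lightweight enough to enable aggressive timing speculation.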

Cited by 20 publications (9 citation statements)
References 30 publications
“…The difference is added back to the final result at the next cycle without stalling the pipeline. Lastly, the forefront research of [123] proposes a technique to improve the efficiency of DNN accelerators with spatial architecture based on overclocking (timing speculation) and inherent error resilience. The authors presented an algorithmic-based lightweight TE detection mechanism to protect convolution layers, enabling aggressive timing speculation.…”
Section: Timing Error Detection
confidence: 99%
“…While the overhead of the naive ABFT is non-trivial, Dionysios Filippas et al [62] proposed a lightweight ABFT implementation, ConvGuard, which predicts the output checksum of convolution implicitly by accumulating only the pixels at the border of the dropped input features. Thibaut Marty et al [63] proposed to utilize the ABFT technique to mitigate timing errors induced by overclocking of the neural network accelerators on FPGAs. Their experiments reveal that the proposed ABFT design poses negligible area overhead, enables aggressive overclocking of the neural network accelerators, and achieves up to 60% throughput improvement of the overall neural network processing.…”
Section: A Related Work
confidence: 99%
“…Sung Kim et al [83] proposed to combine adaptive neural network training and weight memory voltage scaling to achieve energy-efficient neural network processing. Similar cross-layer optimizations that utilize voltage scaling and fault-aware training or high-level fault correction are also applied in many different scenarios [14] [84] [85] [63]. In summary, cross-layer fault-tolerant approaches show promising results generally, and it can be expected that many of the fault-tolerant techniques surveyed in prior sections can also be combined and optimized for more effective protection against hardware faults.…”
Section: Cross-layer Fault Tolerance
confidence: 99%
“…As analysed in [17], ABFT provides high confidence, i.e., close to 100%, in detecting errors in neural computations. While it does not protect the look-up-table-based non-linear operations in the activation layers of the neural net, the delay paths of multiplication and addition circuits are far longer and more likely to suffer from lower-voltage-induced phenomena [28]. In Fig.…”
Section: B Error Detection Through ABFT
confidence: 99%
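The near-100% detection confidence cited above comes from the classic checksum argument: when a convolution is lowered to a matrix multiply (im2col), a Huang–Abraham-style column checksum flags any single corrupted multiply-accumulate, since e^T(AB) = (e^T A)B must hold. A small illustration (matrix shapes and the injected-fault magnitude are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((16, 9))  # im2col'd input patches (hypothetical sizes)
B = rng.standard_normal((9, 4))   # flattened filters

C = A @ B  # the MAC array's work

# Column checksum: the column sums of C must match one extra
# dot product per column, (e^T A) B.
check = A.sum(axis=0) @ B
clean = np.allclose(C.sum(axis=0), check)

# A single corrupted MAC anywhere in a column breaks that column's checksum.
C[5, 2] += 1e-3
corruption_detected = not np.allclose(C.sum(axis=0), check)
```

An error escapes only if corruptions in one column cancel exactly, which is why the quoted survey reports detection confidence close to 100%.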