2023
DOI: 10.1109/tcad.2022.3212645
AdaPT: Fast Emulation of Approximate DNN Accelerators in PyTorch


Cited by 19 publications (6 citation statements)
References 14 publications
“…Moreover, as shown in Fig. 1 (a-d), the peak ARED of our scaleTRIM(3,4) is the lowest among the state-of-the-art works. Our Novel Contributions: in this paper, we propose a novel scalable approximate multiplier that utilizes a lookup-table-based compensation unit to reduce the approximation error.…”
Section: Introduction
confidence: 76%
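The quoted design pairs operand truncation with a lookup-table compensation term. A minimal Python sketch of that general idea follows; the truncation width and the table contents are illustrative assumptions, not the published scaleTRIM design:

```python
# Illustrative truncation-plus-LUT approximate multiplier for unsigned
# 8-bit operands. T and the LUT contents are assumptions for
# illustration, not the actual scaleTRIM parameters.

T = 3                      # low-order bits truncated from each operand
MASK = (1 << T) - 1

# Small 2^T x 2^T compensation table: here it restores the product of
# the two discarded tails; the hi-lo cross terms remain approximated away.
LUT = [[a_lo * b_lo for b_lo in range(1 << T)] for a_lo in range(1 << T)]

def approx_mult(a, b):
    """Approximate a*b: multiply truncated operands, add LUT correction."""
    base = ((a >> T) * (b >> T)) << (2 * T)
    return base + LUT[a & MASK][b & MASK]

def ared(a, b):
    """Absolute relative error distance of the approximation (a, b nonzero)."""
    exact = a * b
    return abs(exact - approx_mult(a, b)) / exact
```

When both truncated tails are zero the result is exact (e.g. `approx_mult(8, 8) == 64`); otherwise the error comes only from the dropped cross terms, so the approximate product never exceeds the exact one.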
“…However, the MARED of one of our proposed multiplier configurations (e.g., scaleTRIM(3,4)) is 3.73%. Moreover, as shown in Fig.…”
Section: Introduction
confidence: 90%
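MARED here is the mean absolute relative error distance of a multiplier over its input space. A small sketch of how such a figure can be computed (exhaustive operand sweep; normalizing by the exact product is one common convention and is assumed here, since papers differ on this point):

```python
def mared(mult, bits=8):
    """Mean absolute relative error distance of multiplier `mult` over
    all nonzero operand pairs, relative to the exact product."""
    total, count = 0.0, 0
    for a in range(1, 1 << bits):
        for b in range(1, 1 << bits):
            exact = a * b
            total += abs(exact - mult(a, b)) / exact
            count += 1
    return 100.0 * total / count  # as a percentage
```

An exact multiplier scores 0% under any convention; the quoted 3.73% for scaleTRIM(3,4) is the figure reported by the citing paper under its own convention.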
“…MARLIN was run on a 32-thread Ryzen 5950X CPU with 64 GB of DDR4 RAM and an Nvidia Quadro RTX A5000. The GPU was used only during the initial training of the FP32 and exact INT8 NNs presented in Table VIII, while the CPU was used to simulate the approximate convolutional layers during the training, validation, and testing performed during the search, since AdaPT only supports CPU computation [33]. The number of threads was set to 16 for every experiment to compare how MARLIN's execution time scales with NNs of different depths.…”
Section: E. Discussion
confidence: 99%
“…A single runtime reconfigurable approximate unit or several multipliers with fixed approximation levels could be integrated into the framework with little or no modification. Once this high-level description of the computational unit is available, the approximate model can be implemented and tested through the AdaPT library [33]. Any NN topology built with PyTorch's convolutional and fully connected layers can be easily included in MARLIN by overloading the layer definitions with the AdaPT ones without retraining or changing the model architecture.…”
Section: Proposed Methodology, A. Overview
confidence: 99%
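The drop-in pattern described above — overloading exact layer definitions with approximate ones while keeping the trained weights and topology — can be sketched without the real AdaPT API. All class and function names below are illustrative, and a toy 1-LSB-truncating multiply stands in for the approximate unit:

```python
# Hedged sketch of swapping exact layers for approximate drop-ins that
# reuse the trained weights; names are illustrative, not AdaPT's API.

class ExactLinear:
    def __init__(self, weights):
        self.weights = weights          # one weight row per output

    def mult(self, w, x):
        return w * x                    # exact product

    def __call__(self, xs):
        return [sum(self.mult(w, x) for w, x in zip(row, xs))
                for row in self.weights]

class ApproxLinear(ExactLinear):
    """Same layer, approximate integer multiply (toy: truncate 1 LSB)."""
    def mult(self, w, x):
        return ((w * x) >> 1) << 1

class TinyNet:
    """Toy 'trained' network: a single 1-output linear layer."""
    def __init__(self):
        self.fc = ExactLinear([[2, 3]])

    def __call__(self, xs):
        return self.fc(xs)

def swap_layers(model, exact_cls, approx_cls):
    """Replace each exact layer with an approximate one, reusing its
    weights in place -- no retraining, no change to the architecture."""
    for name, layer in list(vars(model).items()):
        if type(layer) is exact_cls:
            setattr(model, name, approx_cls(layer.weights))
    return model
```

The same overloading idea is what lets a search framework toggle between exact and approximate arithmetic on an already-trained model.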
“…Additionally, a design of a MAC (multiply-accumulate) unit in a systolic array is synthesized for ASIC to better illustrate the results in an AI core. The impact of the proposed adaptive multiplier on accuracy is studied on different networks (i.e., LeNet-5, AlexNet, ResNet-18, VGG-16, DenseNet) trained on MNIST and CIFAR-10 using 8-bit INT, with the help of the AdaPT framework [27]. Finally, the impact of the proposed multiplier on the reliability of DNNs is studied using the mentioned benchmarks.…”
Section: A. Experimental Setup
confidence: 99%