2021 58th ACM/IEEE Design Automation Conference (DAC)
DOI: 10.1109/dac18074.2021.9586295
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

Cited by 14 publications (3 citation statements) · References 20 publications
“…SMOF [8] put more effort into reducing kernel sizes and the number of filter channels to overcome fixed-width constraints in SIMD units. For AutoML frameworks on edge devices, [9] combined hardware and software reconfiguration through reinforcement learning to explore hybrid structured pruning for Transformers. Similarly, [10] designed an algorithm-hardware closed-loop framework to efficiently find the best device on which to deploy a given Transformer model.…”
Section: Related Work
confidence: 99%
“…TranCIM incorporates a Sparse Attention Scheduler (SAS) that dynamically configures the in-memory computing workload to accommodate various sparse attention patterns. Song et al. [15] presented two-level pruning to fit Transformers on mobile devices, but their study ignores the underlying implementation. Qi et al. [16], on the other hand, presented an efficient acceleration system that combines balanced model compression with FPGA implementation optimization.…”
Section: Introduction
confidence: 99%
“…To implement these models efficiently on devices, various model compression, accelerator design, and hardware/software co-design techniques [5][6][7][8][9][10][11][12][13] have been proposed to achieve both high accuracy and high efficiency. Unfortunately, most existing AI system designs pursue only high overall accuracy and ignore fairness among diverse groups in the dataset.…”
Section: Introduction
confidence: 99%