2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
DOI: 10.1109/hpca53966.2022.00041
GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design

Abstract: Vision Transformers (ViTs) have achieved state-of-the-art performance on various vision tasks. However, ViTs' self-attention module is still arguably a major bottleneck, limiting their achievable hardware efficiency and more extensive applications to resource-constrained platforms. Meanwhile, existing accelerators dedicated to NLP Transformers are not optimal for ViTs. This is because there is a large difference between ViTs and Transformers for natural language processing (NLP) tasks: ViTs have a relatively fix…

Cited by 40 publications (8 citation statements)
References 53 publications
“…The authors identify the characteristics of Transformer-based models and propose various optimization methods. ViTCoD [34] designs a dedicated accelerator for sparse and dense workloads to boost hardware utilization for vision transformers. Auto-ViT-Acc [36] designs an FPGA accelerator for multi-head attention and an FPGA-aware quantization algorithm to make better use of FPGA resources.…”
Section: Sequential Accelerators (mentioning)
Confidence: 99%
“…To take full advantage of this reduction without introducing significant overheads, OuterSPACE builds a custom accelerator with reconfigurable memory hierarchy and achieves a mean speedup of 7.9× over the CPU running Intel Math Kernel Library and 14.0× against the GPU running CUSP. Furthermore, to alleviate the data movement bottleneck caused by high sparsity, ViTCoD [86] uses a learnable auto-encoder to compress the sparse attentions to a much more compact representation and designs encoder and decoder engines to boost the hardware utilization.…”
Section: Memory Efficiency (mentioning)
Confidence: 99%
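The statement above describes compressing a sparse attention map into a compact code before moving it through memory. As a rough, hypothetical illustration of that idea (not ViTCoD's actual learned auto-encoder), the sketch below stands in a truncated SVD for the trained encoder/decoder pair, so the "code" is a low-rank projection of a mostly-zero attention map:

```python
import numpy as np

rng = np.random.default_rng(0)

# A sparse 64x64 attention map: roughly 10% of entries are nonzero.
attn = rng.random((64, 64))
attn[attn < 0.9] = 0.0

# "Train" a rank-8 encoder/decoder via truncated SVD. In ViTCoD this role
# is played by a learnable auto-encoder; the SVD here is an assumption
# used only to keep the sketch self-contained.
U, _, _ = np.linalg.svd(attn, full_matrices=False)
k = 8

def encode(a):
    # 64x64 attention map -> 8x64 compact code (8x fewer values to move).
    return U[:, :k].T @ a

def decode(z):
    # 8x64 compact code -> approximate 64x64 attention map.
    return U[:, :k] @ z

code = encode(attn)
recon = decode(code)

compression = attn.size / code.size                      # 8.0
err = np.linalg.norm(attn - recon) / np.linalg.norm(attn)  # lossy but bounded
```

The point of the compact representation is that the accelerator's memory traffic scales with `code.size` rather than `attn.size`; the decoder engine reconstructs the map on-chip.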
“…There are also some previous studies on algorithm and hardware co-design for GNNs [36,46,47]. [47] first presents a framework that automatically co-searches the GNN and its accelerator to maximize both task accuracy and acceleration efficiency.…”
Section: Related Work (mentioning)
Confidence: 99%
“…[36] proposes a model-architecture co-design with a lightweight algorithm for temporal GNN inference on FPGAs. [46] proposes GCoD, a GCN algorithm and accelerator co-design framework, involving a two-pronged accelerator with separate engines to process dense and sparse workloads. Some previous studies focus on accelerating GNN training [48,49,10,50].…”
Section: Related Work (mentioning)
Confidence: 99%
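The "two-pronged" idea above routes dense and sparse portions of the workload to different engines. A minimal sketch of that partitioning, with an assumed density threshold and toy engines (not GCoD's actual design), might partition adjacency rows by degree and check that the two paths agree with a single full matmul:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 16-node adjacency: mostly sparse rows, with a few dense ones.
adj = (rng.random((16, 16)) < 0.2).astype(float)
adj[:4] = (rng.random((4, 16)) < 0.8).astype(float)

# Partition rows by density; the 0.5 threshold is an illustrative assumption.
density = adj.sum(axis=1) / adj.shape[1]
dense_rows = np.where(density >= 0.5)[0]
sparse_rows = np.where(density < 0.5)[0]

x = rng.random((16, 8))  # node features

# "Dense engine": plain matmul over the dense partition.
out_dense = adj[dense_rows] @ x

# "Sparse engine": accumulate over nonzeros only, skipping zero entries.
out_sparse = np.zeros((len(sparse_rows), x.shape[1]))
for i, r in enumerate(sparse_rows):
    for c in np.nonzero(adj[r])[0]:
        out_sparse[i] += adj[r, c] * x[c]

# Both partitions together reproduce the monolithic computation.
full = adj @ x
```

Splitting this way lets each engine stay efficient on its own regime: the dense path keeps high compute-unit utilization, while the sparse path avoids wasting cycles on zeros.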