2020
DOI: 10.1109/access.2020.3023946

FPGAN: An FPGA Accelerator for Graph Attention Networks With Software and Hardware Co-Optimization

Abstract: Graph Attention Networks (GATs) exhibit outstanding performance on multiple authoritative node classification benchmarks (both transductive and inductive). The purpose of this research is to implement an FPGA-based accelerator for graph attention networks, called FPGAN, that achieves significant improvements in performance and energy efficiency over a PyTorch baseline without losing accuracy. It eliminates the dependence on digital signal processors (DSPs) and large amounts of on-chip memory …

Cited by 16 publications (8 citation statements)
References 22 publications
“…FPGAN (Yan, Tong & Zhi, 2020) is an FPGA-based accelerator for the inference process of GAT. FPGAN designs a shift-based calculation unit for the intensive exp operations in GAT, which eliminates the dependence of computing performance on DSPs, and uses an exponential approximation algorithm to fit the softmax that normalizes the attention coefficients.…”
Section: FPGA-Based Hardware Accelerators
confidence: 99%
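The shift-based exp idea described in this citation can be sketched in software. The following is an illustrative NumPy reconstruction, not FPGAN's actual hardware unit: rebasing e^x to a power of two lets the integer part of the exponent become a pure bit shift, with the fractional part covered by a cheap linear term.

```python
import numpy as np

def shift_exp(x):
    """Approximate exp(x) as 2^(x*log2(e)): the integer part of the
    exponent is a bit shift in hardware, and the fractional part f is
    handled by the linear interpolation 2^f ~= 1 + f for f in [0, 1)."""
    y = x * np.log2(np.e)                # rebase e^x to 2^y
    k = np.floor(y)                      # integer part -> shift amount
    f = y - k                            # fractional part in [0, 1)
    return (1.0 + f) * np.power(2.0, k)  # 2^k is a shift; (1+f) a cheap multiply

def approx_softmax(scores):
    """Normalize attention coefficients with the shift-based exp."""
    s = scores - scores.max()            # standard max-subtraction for stability
    e = shift_exp(s)
    return e / e.sum()
```

The linear term keeps the worst-case relative error of the approximated exponential around 6%, which is why a retuned normalization like this can replace the exact softmax without large accuracy loss.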
“…To save memory and reduce computational cost, FPGAN (Yan, Tong & Zhi, 2020) compresses the model; its core idea is to convert each weight to zero or a power of two, and to judge whether retraining is required by observing the accuracy loss after conversion. If the accuracy loss for one group of weights is within a reasonable range, compression proceeds to the next group.…”
Section: FPGA-Based Hardware Accelerators
confidence: 99%
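The group-wise compression scheme described in this citation can be sketched as follows. This is a minimal illustration under stated assumptions: `quantize_pow2`, the greedy accept/skip loop, and the `eval_fn` accuracy callback are all hypothetical names, and the real flow retrains rather than simply skipping when a group's accuracy loss is too large.

```python
import numpy as np

def quantize_pow2(w, zero_thresh=1e-3):
    """Snap each weight to zero or to +/- the nearest power of two,
    so multiplications reduce to sign flips and bit shifts."""
    w = np.asarray(w, dtype=np.float64)
    out = np.zeros_like(w)
    nz = np.abs(w) >= zero_thresh            # small weights become zero
    exp = np.round(np.log2(np.abs(w[nz])))   # nearest power-of-two exponent
    out[nz] = np.sign(w[nz]) * np.power(2.0, exp)
    return out

def compress_in_groups(weights, eval_fn, tol, groups):
    """Greedily quantize one group of weights at a time, keeping a group's
    compression only if the accuracy drop (per eval_fn) stays within tol."""
    base = eval_fn(weights)
    for g in groups:
        trial = weights.copy()
        trial[g] = quantize_pow2(trial[g])
        if base - eval_fn(trial) <= tol:
            weights = trial                  # accept and move to next group
        # else: retrain or skip this group (sketch: skip)
    return weights
```

With weights restricted to zero and powers of two, the combination-phase matrix multiplies need no DSP multipliers, which matches the DSP-independence claim in the abstract.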
“…2c, mean pool uses many ThCudaTensor_scatterAddKernels which are also present in the Aggregation phase. Again, similar to previous GNN accelerators [3], [4], [11], [21], [27], [39], [41]- [43], this work will focus only on the Aggregation and Combination phases, as the main kernels in aggregation and combination consume a majority of the GNN inference runtime.…”
Section: B. PyTorch Geometric Characterization
confidence: 99%
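The Aggregation/Combination split that these accelerators target can be illustrated with a minimal dense sketch. This is an assumption-laden toy (real implementations aggregate with sparse scatter-add over an edge list, not a dense adjacency matrix, and `gnn_layer` is an illustrative name):

```python
import numpy as np

def gnn_layer(adj, feats, weight):
    """One GNN layer in the two phases accelerators target.
    Aggregation: gather and sum neighbor features — irregular and
    memory-bound (the scatter-add pattern noted above), shown here
    as a dense matmul with the adjacency matrix for clarity.
    Combination: dense transform of the aggregated features — regular
    and compute-bound."""
    agg = adj @ feats                       # Aggregation phase
    return np.maximum(agg @ weight, 0.0)    # Combination phase + ReLU
```

The contrast between the irregular gather in the first line and the dense matmul in the second is exactly why these kernels dominate GNN inference runtime and get dedicated hardware units.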
“…Computing GNN inference requires a mix of memory and compute intensive operations, which commodity CPUs, GPUs and traditional DNN accelerators do not exploit efficiently [27], [39], [40], [44]. This led to the development of many dedicated GNN accelerators, each with their own design methodology to extract as much performance as possible [3], [4], [11], [21], [27], [39], [41]- [43].…”
Section: Introduction
confidence: 99%