2019
DOI: 10.1145/3359983
Unrolling Ternary Neural Networks

Abstract: The computational complexity of neural networks for large-scale or real-time applications necessitates hardware acceleration. Most approaches assume that the network architecture and parameters are unknown at design time, permitting usage in a large number of applications. This paper demonstrates, for the case where the neural network architecture and ternary weight values are known a priori, that extremely high throughput implementations of neural network inference can be made by customising the datapath and …
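The core idea behind "unrolling" is that when the weights are fixed at design time and restricted to {-1, 0, +1}, every multiply in a dot product collapses into an addition, a subtraction, or nothing at all, so the datapath can be specialized per neuron. A minimal Python sketch of this principle (the function names and the toy weight vector are illustrative, not taken from the paper):

```python
# Weights fixed at design time, restricted to the ternary set {-1, 0, +1}.
WEIGHTS = [1, 0, -1, 1, -1]

def ternary_dot_generic(x, w):
    """Generic inference: one multiply-accumulate per weight."""
    return sum(xi * wi for xi, wi in zip(x, w))

def ternary_dot_unrolled(x):
    """Datapath specialized ("unrolled") for WEIGHTS: no multiplies remain.
    +1 weights become additions, -1 weights become subtractions,
    and 0 weights are pruned from the circuit entirely."""
    return x[0] - x[2] + x[3] - x[4]

x = [3.0, 7.0, 2.0, -1.0, 4.0]
assert ternary_dot_generic(x, WEIGHTS) == ternary_dot_unrolled(x)
```

In hardware, the same specialization removes all multipliers and all zero-weight wiring from the circuit, which is what makes the extremely high throughput claimed in the abstract possible.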

Cited by 22 publications (6 citation statements) · References 26 publications

Citation statements:
“…After investigating recent VGG networks, we propose a similar VGG-style base network (VSBN) model for our task. Note that VGG-style networks are popularly used in many recent works, such as References [19,20]. After the 1st convolution block (CB), which consists of two repetitions of a convolutional layer with 64 kernels of size 3 × 3, abbreviated as 2x(64 3x3), and one max pooling layer, the output is 112 × 112 × 64 (see S2 in Figure 2a).…”
Section: B. VGG-Style Base Network (mentioning)
confidence: 99%
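As a quick sanity check of the shapes in the quoted description, here is a minimal sketch assuming a 224 × 224 × 3 input, 'same'-padded 3 × 3 convolutions, and 2 × 2 max pooling with stride 2 (the standard VGG configuration; these assumptions are mine, not stated in the excerpt):

```python
def conv3x3_same(h, w, c_out):
    # A 'same'-padded 3x3 convolution preserves the spatial size.
    return h, w, c_out

def maxpool2x2(h, w, c):
    # 2x2 max pooling with stride 2 halves each spatial dimension.
    return h // 2, w // 2, c

h, w, c = 224, 224, 3            # assumed RGB input (standard VGG)
for _ in range(2):               # 2x(64 3x3): two conv layers, 64 kernels each
    h, w, c = conv3x3_same(h, w, 64)
h, w, c = maxpool2x2(h, w, c)
print((h, w, c))                 # (112, 112, 64), matching S2 in the excerpt
```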
“…This work splits the whole GNN into several sub-layers and adopts a layer-wise hardware architecture [12,2,13,16] to map all the sub-layers on-chip, which is flexible and takes full advantage of the customizability of FPGAs. In addition, we perform the calculation for different sub-layers on their own units with dedicated optimizations to achieve low latency and high design throughput.…”
Section: Implementation of the Hardware Accelerator (mentioning)
confidence: 99%
“…This work proposes a custom Low Latency (LL)-GNN hardware architecture based on a layer-wise tailor-made pipeline to accelerate GNNs for particle detectors, using the GNN-based JEDI-net algorithm as an end-to-end application. The layer-wise architecture has been used to speed up CNNs [12,13,14,15,16] and RNNs [17], but few studies focus on accelerating GNNs. First, we propose custom strength reduction for matrix operations based on the characteristics of interaction-network-based GNNs with a fully connected graph as input, which avoids the expensive matrix multiplications of the adjacency matrix with the input feature matrix.…”
Section: Introduction (mentioning)
confidence: 99%
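The strength reduction mentioned in this excerpt has a simple algebraic core: for a fully connected graph on N nodes without self-loops, the adjacency matrix is A = J − I (all-ones minus identity), so A·X equals the column sums of X broadcast to every node minus X itself, turning an O(N²F) matrix product into an O(NF) reduction. A sketch of that general idea (my illustration, not necessarily the paper's exact formulation):

```python
import numpy as np

N, F = 6, 4                          # assumed toy sizes: 6 nodes, 4 features
X = np.random.rand(N, F)

# Naive aggregation: multiply by the dense adjacency matrix of a
# fully connected graph without self-loops, A = J - I.
A = np.ones((N, N)) - np.eye(N)
naive = A @ X                        # O(N^2 * F) work

# Strength-reduced form: A @ X = colsum(X) - X, only O(N * F) work.
reduced = X.sum(axis=0, keepdims=True) - X

assert np.allclose(naive, reduced)
```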
“…However, these methods only quantize network weights into three values; network outputs are still kept as real-valued variables. Recent studies on hardware implementations of ternary neural networks [1,24,20] have also shown that it is possible to achieve efficient and fast ternary-based computation on Field-Programmable Gate Arrays (FPGAs) [3].…”
Section: Ternary Quantization and Hardware (mentioning)
confidence: 99%
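For readers unfamiliar with what "quantize network weights into three values" looks like in practice, below is a minimal sketch of threshold-based ternarization in the style of Ternary Weight Networks (the 0.7 · mean|w| threshold and the fitted scale are that method's heuristics, offered here only as an illustration, not as this paper's scheme):

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Map real-valued weights to {-1, 0, +1} plus a shared layer scale.
    Threshold delta = delta_factor * mean(|w|) is a common heuristic."""
    delta = delta_factor * np.abs(w).mean()
    t = np.zeros_like(w)
    t[w > delta] = 1.0
    t[w < -delta] = -1.0
    # Least-squares scale fitted over the surviving non-zero entries.
    mask = t != 0
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return t, alpha

w = np.random.randn(8)
t, alpha = ternarize(w)
print(t, alpha)   # ternary codes and the shared scaling factor
```

It is exactly this {-1, 0, +1} weight structure that the unrolled FPGA datapaths cited above exploit: the scale factor can be folded into downstream layers, leaving only additions and subtractions in the critical path.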