T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA

Yao, Chen; Zhang, Kai; Cheng, Gong; Hao, Cong; Zhang, Xiaofan; Li, Tao; Chen, Deming

doi:10.1109/isvlsi.2019.00012

Cited by 34 publications

(35 citation statements)

References 10 publications


“…Even with the k-means based solutions, the distribution of the weights in the same layer is ignored in the quantization process. However, the consideration of the distribution of the weight data is proven to be effective for the accuracy control in the existing approaches [8], [18].…”

Section: Motivation of the VecQ Methods (mentioning)

confidence: 99%

“…As an effective way to compress DNNs, many quantization methods have been explored [6], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28]. These quantization methods can be roughly categorized into 3 different types based on their objective functions for the quantization process:…”

Section: Related Work and Motivation (mentioning)

confidence: 99%

“…T-DLA [8] quantizes the scaling factor of ternary weights and full-precision activation into fixed-point numbers and constrains the quantization loss of activation values by adopting infinite norms. Compared with [9], [12], it shifts the available bitwidth to the most effective data portion to make full use of the targeted bitwidth.…”

Section: Other Work (mentioning)

confidence: 99%

“…The low-bitwidth processing, which reduces the cost of the inference by using less memory and reducing the complexity of the multiply-accumulate operation, improves the efficiency of the execution of the model significantly [5], [7]. However, lowering the bitwidth of the data often brings accuracy degradation [4], [8], [9]. This requires the quantization solution to balance between computing efficiency and final model accuracy.…”

Section: Introduction (mentioning)

confidence: 99%


VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization

Cheng, Yao, et al. 2021. IEEE Trans. Comput.

Self Cite


Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and final accuracy is complex and non-convex, which makes it difficult to optimize directly. Minimizing direct quantization loss (DQL) of the coefficient data is an effective local optimization method, but previous works often neglect the accurate control of the DQL, resulting in a higher loss of the final DNN model accuracy. In this paper, we propose a novel metric, called Vector Loss. Using this new metric, we decompose the minimization of the DQL into two independent optimization processes, which significantly outperform the traditional iterative L2 loss minimization process in terms of effectiveness, quantization loss, and final DNN accuracy. We also develop a new DNN quantization solution called VecQ, which provides minimal direct quantization loss and achieves higher model accuracy. In order to speed up the proposed quantization process during model training, we accelerate the quantization process with a parameterized probability estimation method and template-based derivation calculation. We evaluate our proposed algorithm on MNIST, CIFAR, ImageNet, IMDB movie review and THUCNews text data sets with numerical DNN models. The results demonstrate that our proposed quantization solution is more accurate and effective than the state-of-the-art approaches yet with more flexible bitwidth support. Moreover, the evaluation of our quantized models on Saliency Object Detection (SOD) tasks maintains comparable feature extraction quality with up to 16× weight size reduction.

Index Terms: DNN compression, DNN quantization, vectorized weight quantization, low bitwidth, vector loss.


“…Finally, a recent trend in embedded machine learning for IoT devices includes the use of hardware accelerators for neural networks. Examples can be found in academic research [26][27][28] and off-the-shelf industrial solutions [29]. Although leveraging such hardware is important whenever available, our work targets wearable systems with no special hardware capabilities; hence, the proposed framework operates on general-purpose microcontrollers and is backwards-compatible to legacy IoT systems.…”

Section: Related Work (mentioning)

confidence: 99%

From Bits of Data to Bits of Knowledge—An On-Board Classification Framework for Wearable Sensing Systems

Zalewski, Marchegiani, Elsts, et al. 2020. Sensors


Wearable systems constitute a promising solution to the emerging challenges of healthcare provision, feeding machine learning frameworks with necessary data. In practice, however, raw data collection is expensive in terms of energy, and therefore imposes a significant maintenance burden to the user, which in turn results in poor user experience, as well as significant data loss due to improper battery maintenance. In this paper, we propose a framework for on-board activity classification targeting severely energy-constrained wearable systems. The proposed framework leverages embedded classifiers to activate power-hungry sensing elements only when they are useful, and to distil the raw data into knowledge that is eventually transmitted over the air. We implement the proposed framework on a prototype wearable system and demonstrate that it can decrease the energy requirements by one order of magnitude, yielding high classification accuracy that is reduced by approximately 5%, as compared to a cloud-based reference system.


Tulipp and ClickCV: How the Future Demands of Computer Vision Can Be Met Using FPGAs

Swirski. 2020. Towards Ubiquitous Low-Power Image Processing Platforms

