2022 IEEE 16th International Symposium on Applied Computational Intelligence and Informatics (SACI)
DOI: 10.1109/saci55618.2022.9919465
Benchmarking TensorFlow Lite Quantization Algorithms for Deep Neural Networks

Cited by 5 publications (2 citation statements). References 12 publications.
“…Quantization is applied to reduce the numerical representation of the neural network parameters with the aim of decreasing the memory footprint and consequently the model size. Since neural network models are usually highly over-parameterized, the precision could be maintained at a high level [11].…”
Section: B. Neural Model Optimization and Compression Techniques (mentioning, confidence: 99%)
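
To make the memory-footprint argument concrete, here is a back-of-the-envelope sketch (the parameter count is illustrative, not taken from the paper) of why quantizing 32-bit weights to 8 bits yields roughly the 4x compression reported in the second citation statement below:

    # Footprint of a model's weights before and after 8-bit quantization.
    # The parameter count is a hypothetical example, not from the paper.
    num_params = 5_000_000                 # e.g., a mid-sized CNN
    float32_bytes = num_params * 4         # float32: 4 bytes per parameter
    int8_bytes = num_params * 1            # int8: 1 byte per parameter
    print(f"float32: {float32_bytes / 1e6:.1f} MB")           # 20.0 MB
    print(f"int8:    {int8_bytes / 1e6:.1f} MB")              # 5.0 MB
    print(f"compression: {float32_bytes / int8_bytes:.0f}x")  # 4x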
“…Furthermore, representations with less than 8 bits have already been proposed, and even binarization [10]. I. Orasan et al. [11] investigated several post-training quantization solutions using the TensorFlow Lite deep learning framework on CNN models of different sizes. The obtained compression ratio is up to 4 times, and the worst-case accuracy degradation is only 0.43%.…”
Section: Introduction (mentioning, confidence: 99%)
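
For readers unfamiliar with the workflow being benchmarked, below is a minimal sketch of one post-training quantization path in TensorFlow Lite (full-integer quantization with a representative dataset). It is an illustration of the technique, not the paper's exact experimental setup; the SavedModel path is a placeholder and the calibration data here is random.

    import numpy as np
    import tensorflow as tf

    # Placeholder calibration inputs; in practice use real samples from
    # the training distribution (shape must match the model's input).
    calibration_images = np.random.rand(100, 224, 224, 3).astype(np.float32)

    def representative_dataset():
        # Yield a handful of samples so the converter can calibrate
        # activation ranges for integer quantization.
        for image in calibration_images:
            yield [image[np.newaxis, ...]]

    # "saved_model_dir" is a placeholder path to an already-trained model.
    converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Optionally force fully integer (int8) kernels and int8 model I/O:
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model = converter.convert()
    with open("model_int8.tflite", "wb") as f:
        f.write(tflite_model)

Omitting the representative dataset and keeping only optimizations = [tf.lite.Optimize.DEFAULT] gives dynamic-range quantization instead, the other common post-training variant among those the paper compares.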