2019
DOI: 10.48550/arxiv.1910.04540
Preprint

QPyTorch: A Low-Precision Arithmetic Simulation Framework

Cited by 3 publications (3 citation statements)
References 4 publications
“…A. Platform, Datasets, and Models: PyTorch and QPyTorch [10] were used as the frameworks to study the proposed method. Four commonly used NLP datasets, UDPOS [11], SNLI [12], Multi30K [13], and WikiText-2 [14], were used in the simulations.…”
Section: Simulation and Discussion
confidence: 99%
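
As background on what using QPyTorch as a simulation framework looks like in practice, here is a minimal sketch (not taken from the cited paper; the tensor and the 8-bit format are hypothetical) of quantizing a tensor with qtorch.quant.float_quantize:

```python
import torch
from qtorch.quant import float_quantize

# Simulate storing a tensor in a hypothetical 8-bit float format
# (5 exponent bits, 2 mantissa bits), rounding to nearest.
x = torch.randn(4, 4)
x_low = float_quantize(x, exp=5, man=2, rounding="nearest")
print(x_low)  # same shape as x, values snapped to the low-precision grid
```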
“…We slightly modified the self-attention layers in Longformer and ViL by inserting quantization layers after the operators, to simulate the precision on SALO. These quantization layers are implemented by QPyTorch [17], a low-precision arithmetic simulation package. We perform quantization-aware finetuning on both pretrained models.…”
Section: Impact Of Quantization
confidence: 99%
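
A rough sketch of that pattern follows (not the SALO authors' code; the attention wrapper and the 8-bit format are hypothetical). QPyTorch's Quantizer is a regular nn.Module, so it can be inserted after any operator to simulate the operator's output being stored at reduced precision:

```python
import torch
import torch.nn as nn
from qtorch import FloatingPoint
from qtorch.quant import Quantizer

class QuantizedAttnScores(nn.Module):
    """Computes attention scores, then quantizes the result to a low-precision
    float format to simulate reduced-precision hardware (format is hypothetical)."""
    def __init__(self, exp=5, man=2):
        super().__init__()
        num = FloatingPoint(exp=exp, man=man)
        # Quantizer is an nn.Module, so it can be dropped in after any operator.
        self.quant = Quantizer(forward_number=num, backward_number=num,
                               forward_rounding="nearest",
                               backward_rounding="stochastic")

    def forward(self, q, k):
        scores = torch.matmul(q, k.transpose(-2, -1))  # full-precision matmul
        return self.quant(scores)                      # low-precision simulation
```

Because the quantization layer participates in autograd, the same wrapper supports the quantization-aware finetuning the statement describes.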
“…To extensively evaluate pure 16-bit training with stochastic rounding and Kahan summation, we additionally consider larger datasets and more applications: ResNet-50 on ImageNet [34], BERT-Base on the Wiki103 language model [35], the DLRM model on the Criteo Terabyte dataset [36], and Deepspeech2 [20] on the LibriSpeech datasets [37]. As there is no publicly available accelerator with the software and hardware support necessary for our study, we simulate pure 16-bit training using the QPyTorch simulator [38]. QPyTorch models PyTorch kernels such as matrix multiplication as compute graph operators, and effectively simulates FMAC units with 32-bit accumulators.…”
Section: Experiments In Deep Learning
confidence: 99%
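
As a sketch of how such a simulation can be set up (this is not the Revisiting BFloat16 Training code; the model, learning rate, and choice of 16-bit format are hypothetical), QPyTorch's OptimLP wrapper re-quantizes weights and gradients on every update with stochastic rounding. The Kahan-summation accumulator from the cited study would require a custom optimizer step and is not shown:

```python
import torch
from qtorch import FloatingPoint
from qtorch.optim import OptimLP
from qtorch.quant import quantizer

# 16-bit float format (IEEE half: 5 exponent bits, 10 mantissa bits).
half = FloatingPoint(exp=5, man=10)

# Quantization functions with stochastic rounding for weights and gradients.
weight_quant = quantizer(forward_number=half, forward_rounding="stochastic")
grad_quant = quantizer(forward_number=half, forward_rounding="stochastic")

model = torch.nn.Linear(128, 10)  # stand-in for a real network
base_opt = torch.optim.SGD(model.parameters(), lr=0.1)

# OptimLP wraps the base optimizer so parameters and gradients are
# re-quantized to 16 bits on every step, simulating pure 16-bit training.
opt = OptimLP(base_opt, weight_quant=weight_quant, grad_quant=grad_quant)
```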

Revisiting BFloat16 Training

Zamirai, Zhang, Aberger et al. 2020. Preprint.