2022
DOI: 10.48550/arxiv.2203.15952
Preprint

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Cited by 2 publications (9 citation statements)
References 0 publications
“…the best checkpoint based on dev-clean WER and observe no degradation from the test sets with the model size reduced by 6.4×. Although this seems counter-intuitive in that the compressed model outperforms the 32-bit baseline, it is not rare as also shown in [11]. One explanation is that by driving weights towards quantization centroids, the search space is drastically reduced, yielding an arguably easier optimization process.…”
Section: Model (mentioning); confidence: 93%
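The explanation above, that QAT pulls weights toward a small set of quantization centroids, can be illustrated with a minimal sketch. This is not taken from the cited papers; the per-tensor symmetric scheme and the 4-bit width are assumptions chosen for illustration:

```python
# Minimal sketch: symmetric 4-bit quantize-dequantize of a weight tensor.
# With 4 bits there are only 16 representable levels ("centroids"), so
# weights trained against their quantized values concentrate on few points.
import numpy as np

def fake_quantize(w: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Quantize-dequantize w onto at most 2**num_bits symmetric levels."""
    qmax = 2 ** (num_bits - 1) - 1              # 7 for 4-bit
    scale = np.max(np.abs(w)) / qmax            # per-tensor scale (assumed)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

w = np.random.randn(1000)
print(len(np.unique(fake_quantize(w))))         # at most 16 distinct values
```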
“…In contrast, FP-QAT can be regularizer free [11,21]. Usually, the process is to use a "fake quantizer" or equivalent operations during training, hard quantizing weights to a specific range and bit-depth; and then at runtime, converting the model to INT8 format via TFLite [22].…”
Section: Related QAT Approaches (mentioning); confidence: 99%
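As a rough illustration of the "fake quantizer" idea described in that statement, the sketch below quantize-dequantizes a weight tensor in float during the forward pass. It assumes TensorFlow 2.x and is not the exact setup of [11, 21, 22]; the INT8 TFLite conversion at export time is only indicated in a comment:

```python
# Minimal sketch of a "fake quantizer" (quantize-dequantize in float).
# The model, training loop, and TFLite export are omitted; names here
# are illustrative, not taken from the cited papers.
import tensorflow as tf

w = tf.Variable(tf.random.normal([128, 128]), name="weight")

def quantized_weight(w, num_bits=8):
    # Hard-quantize to a symmetric range at the given bit-depth, but keep
    # float dtype so gradients pass through within the clipping range.
    bound = tf.reduce_max(tf.abs(w))
    return tf.quantization.fake_quant_with_min_max_vars(
        w, min=-bound, max=bound, num_bits=num_bits)

wq = quantized_weight(w, num_bits=8)  # use wq in place of w in the layer
# At export time the trained model would be converted to an INT8 flatbuffer
# with the TFLite converter (tf.lite.TFLiteConverter), as described in [22].
```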