“…In particular, [9,15,28] set the foundations for 1-bit quantization, while [16,50] did so for arbitrary bitwidth quantization. Progressive quantization [2,1,53,33], loss-aware quantization [13,49], improved gradient estimators for non-differentiable functions [21], and RL-aided training [20] have focused on improved training schemes, while mixed-precision quantization [36], hardware-aware quantization [37], and architecture search for quantized models [34] have explored alternatives to standard quantized models. However, these strategies focus exclusively on improving the performance and efficiency of static networks.…”