“…Currently, most prevailing neural image codecs follow the VAE framework [5,6]. A series of works build upon this framework, improving entropy estimation [11,18,26,40], quantization [1,3,5,19,51], variable-rate coding [12,13], and perceptual quality [4,38]. Among them, we note that the autoregressive context model [26,40] achieves clear rate savings but brings much higher decoding complexity.…”