Image Coding With Data-Driven Transforms: Methodology, Performance and Potential

Zhang, Xinfeng; Yang, Chao; Li, Xiaoguang; Liu, Shan; Yang, Haitao; Katsavounidis, Ioannis; Lei, Shawmin; Kuo, C.-C. Jay

doi:10.1109/tip.2020.3025203

Cited by 26 publications

(17 citation statements)

References 46 publications


“…Specifically, Multiple Transform Selection (MTS) is introduced in VVC to select the most desirable transform with the best rate-distortion performance. Data-driven transforms like KLT have also been explored for image compression [36], where multiple KLT candidates are trained from different clusters of multiscale patches. Inspired by the idea of signal-dependent transform selection in previous image coding methods, our model adopts end-to-end learned neural networks to generate data-dependent transforms for more efficient image compression.…”

Section: Hybrid Image Compression (mentioning)

confidence: 99%

“…One branch of researches focuses on designing more powerful transforms, e.g. the improved variants [28,29,35] of discrete cosine transform (DCT) [8,13], and the theoretically optimal linear Karhunen-Loève transform (KLT) [36]. Although more decorrelated and energy-compact coefficients are obtained to improve the coding performance, these methods heavily rely on the distribution, and therefore are not general and flexible enough.…”

Section: Introduction (mentioning)

confidence: 99%


Neural Data-Dependent Transform for Learned Image Compression

Wang, Yang, Hu et al. 2022

Preprint


Learned image compression has achieved great success due to its excellent modeling capacity, but seldom further considers the Rate-Distortion Optimization (RDO) of each input image. To explore this potential in the learned codec, we make the first attempt to build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image. Specifically, apart from the image content stream, we employ an additional model stream to generate the transform parameters at the decoder side. The presence of a model stream enables our model to learn more abstract neural-syntax, which helps cluster the latent representations of images more compactly. Beyond the transform stage, we also adopt neural-syntax based post-processing for the scenarios that require higher quality reconstructions regardless of extra decoding overhead. Moreover, the involvement of the model stream further makes it possible to optimize both the representation and the decoder in an online way, i.e. RDO at the testing time. It is equivalent to a continuous online mode decision, like coding modes in the traditional codecs, to improve the coding efficiency based on the individual input image. The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism, demonstrating the superiority of our method in coding efficiency compared to the latest conventional standard Versatile Video Coding (VVC) and other state-of-the-art learning-based methods. Our project is available at: https://dezhao-wang.github.io/Neural-Syntax-Website/.


“…KLT is a signal-dependent linear transform. As a data-driven transform, KLT has been applied in image coding [34], image quality assessment [35,36] and achieved promising results due to its excellent decorrelated performance. The KLT domain and spatial domain can be converted from one to another without loss of information.…”

Section: A Design Philosophy and Motivation (mentioning)

confidence: 99%
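The two properties named in the statement above (decorrelation and lossless conversion between the KLT domain and the spatial domain) can be checked numerically. The following is a minimal sketch, not the method of any cited paper: the helper names `learn_klt`, `klt_forward` and `klt_inverse` are hypothetical, and the "patches" are synthetic correlated vectors rather than real image data.

```python
import numpy as np

def learn_klt(patches):
    """Return (mean, basis); basis columns are eigenvectors of the patch
    covariance, sorted by decreasing eigenvalue."""
    mean = patches.mean(axis=0)
    centered = patches - mean
    cov = centered.T @ centered / len(patches)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return mean, eigvecs[:, order]

def klt_forward(patches, mean, basis):
    # Project mean-removed patches onto the orthonormal KLT basis.
    return (patches - mean) @ basis

def klt_inverse(coeffs, mean, basis):
    # The basis is orthonormal, so its transpose is its inverse.
    return coeffs @ basis.T + mean

rng = np.random.default_rng(0)
# Hypothetical data: 500 correlated 16-dimensional vectors standing in for 4x4 patches.
mixing = rng.normal(size=(16, 16))
patches = rng.normal(size=(500, 16)) @ mixing

mean, basis = learn_klt(patches)
coeffs = klt_forward(patches, mean, basis)
recon = klt_inverse(coeffs, mean, basis)

# Reconstruction error should be at floating-point noise level.
max_err = np.max(np.abs(recon - patches))
# Off-diagonal covariance of the coefficients should vanish (decorrelation).
coeff_cov = np.cov(coeffs, rowvar=False)
off_diag = coeff_cov - np.diag(np.diag(coeff_cov))
```

Because the basis is learned from the data itself, a decoder needs the basis (or a way to derive it) to invert the transform, which is the usual practical cost of signal-dependent transforms.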

Towards Top-Down Just Noticeable Difference Estimation of Natural Images

Jiang, Liu, Wang et al. 2021

Preprint


Just noticeable difference (JND) of natural images refers to the maximum change magnitude that the typical human visual system (HVS) cannot perceive. Existing efforts on JND estimation are mainly dedicated to modeling the visibility masking effects of different factors in spatial and frequency domains, and then fusing them into an overall JND estimate. However, the overall visibility masking effect can be related to more contributing factors beyond those that have been considered in the literature, and it is also insufficiently accurate to formulate the masking effect even for an individual factor. Moreover, the potential interactions among different masking effects are also difficult to characterize with a simple fusion model. In this work, we turn to a dramatically different way to address these problems with a top-down design philosophy. Instead of formulating and fusing multiple masking effects in a bottom-up way, the proposed JND estimation model directly generates a critical perceptual lossless (CPL) image from a top-down perspective and calculates the difference map between the original image and the CPL image as the final JND map. Given an input image, an adaptive critical point (perceptual lossless threshold), defined as the minimum number of spectral components in Karhunen-Loève Transform (KLT) used for perceptual lossless image reconstruction, is derived by exploiting the convergence characteristics of KLT coefficient energy. Then, the CPL image can be reconstructed via inverse KLT according to the derived critical point. Finally, the difference map between the original image and the CPL image is calculated as the JND map. The performance of the proposed JND model is evaluated with two applications including JND-guided noise injection and JND-guided image compression. Experimental results have demonstrated that our proposed JND model can achieve better performance than several latest JND models.
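The pipeline described in this abstract (KLT, critical point, inverse KLT, difference map) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' model: the abstract does not say how the critical point is derived from the energy convergence, so a fixed cumulative-energy ratio (`energy_ratio`) is used here as a stand-in, and the function names, the 8x8 block size and the synthetic test image are all hypothetical.

```python
import numpy as np

def image_to_blocks(img, b):
    # Split an (h, w) image into non-overlapping b x b blocks, one per row.
    h, w = img.shape
    return img.reshape(h // b, b, w // b, b).swapaxes(1, 2).reshape(-1, b * b)

def blocks_to_image(blocks, shape, b):
    # Inverse of image_to_blocks.
    h, w = shape
    return blocks.reshape(h // b, w // b, b, b).swapaxes(1, 2).reshape(h, w)

def jnd_map_sketch(img, b=8, energy_ratio=0.99):
    blocks = image_to_blocks(img.astype(np.float64), b)
    mean = blocks.mean(axis=0)
    centered = blocks - mean
    cov = centered.T @ centered / len(blocks)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, basis = eigvals[order], eigvecs[:, order]
    coeffs = centered @ basis
    # ASSUMPTION: the critical point k is the smallest number of leading
    # components whose cumulative energy reaches energy_ratio of the total.
    cum = np.cumsum(np.clip(eigvals, 0.0, None))
    k = min(int(np.searchsorted(cum, energy_ratio * cum[-1])) + 1, len(eigvals))
    # Reconstruct the stand-in for the CPL image from the first k components.
    truncated = np.zeros_like(coeffs)
    truncated[:, :k] = coeffs[:, :k]
    cpl = blocks_to_image(truncated @ basis.T + mean, img.shape, b)
    # The JND map is the per-pixel difference between original and CPL image.
    return k, cpl, np.abs(img - cpl)

rng = np.random.default_rng(0)
# Hypothetical smooth-ish test image built by integrating noise.
img = rng.normal(size=(64, 64)).cumsum(axis=0).cumsum(axis=1)
k_lo, cpl_lo, jnd_lo = jnd_map_sketch(img, energy_ratio=0.90)
k_hi, cpl_hi, jnd_hi = jnd_map_sketch(img, energy_ratio=0.999)
```

Raising the energy ratio keeps more components, so the reconstruction moves closer to the original and the resulting map carries less energy; a perceptually motivated rule for choosing `k` would replace the fixed ratio in a real model.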


“…Despite these problems, multiple transform systems have been proposed for image and video compression. Just to mention some of the most recent ones, in [6] the authors explore data-driven transforms and propose a KLT based image compression algorithm with variable transform sizes. Also, different approaches propose to build graph-based separable transforms specifically for video coding [7,8].…”

Section: Introduction (mentioning)

confidence: 99%

Symmetry-Based Graph Fourier Transforms: Are They Optimal For Image Compression?

Gnutti, Guerrini, Leonardi et al. 2021

2021 IEEE International Conference on Image Processing (ICIP)


Traditional block-based transforms are based on applying a single transform to all blocks. As an alternative, better performance in image and video processing and representation can be achieved by choosing one among a discrete set of transforms for each block. As an example, our recently proposed set of multiple transforms called Symmetry-Based Graph Fourier Transforms (SBGFTs) have shown good performance in terms of energy compaction, improving HEVC intra coding performance when used to replace the Discrete Cosine Transform (DCT). This paper further explores the performance of the SBGFTs in a multiple transforms, non-linear approximation perspective, by comparing them with two alternative sets of orthogonal transforms, namely, the Karhunen-Loève Transform (KLT) and the Sparse Orthonormal Transform (SOT). Experimental results confirm that SBGFTs achieve superior representation ability in this context as well, suggesting that they could assume a central role in image compression.
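The idea of choosing one transform per block from a discrete set, judged by non-linear approximation quality, can be illustrated with a toy example. This sketch does not implement SBGFTs, the SOT or a trained KLT: the candidate set is just an orthonormal DCT-II and the identity transform, the signals are 1-D, and `dct_matrix` and `best_transform` are hypothetical helper names.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II matrix; row k is the k-th basis vector.
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def best_transform(signal, transforms, keep):
    """Return (index, error) of the orthonormal transform with the smallest
    squared error when only the `keep` largest-magnitude coefficients survive."""
    best_idx, best_err = -1, np.inf
    for idx, t in enumerate(transforms):
        coeffs = t @ signal
        # Indices of the coefficients that would be discarded.
        dropped = np.argsort(np.abs(coeffs))[: len(coeffs) - keep]
        # For an orthonormal transform, the reconstruction error equals
        # the energy of the discarded coefficients (Parseval).
        err = float(np.sum(coeffs[dropped] ** 2))
        if err < best_err:
            best_idx, best_err = idx, err
    return best_idx, best_err

n = 8
candidates = [dct_matrix(n), np.eye(n)]  # index 0: DCT, index 1: identity
ramp = np.linspace(0.0, 1.0, n)          # smooth signal
impulse = np.zeros(n)
impulse[3] = 1.0                         # spatially sparse signal

ramp_choice, ramp_err = best_transform(ramp, candidates, keep=2)
impulse_choice, impulse_err = best_transform(impulse, candidates, keep=1)
```

The smooth ramp is better approximated by the DCT, while the impulse is represented exactly by the identity, which is why a per-block choice can beat any single fixed transform; a real codec would also have to account for the bits spent signalling the chosen index.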
