Johannes Ballé scite author profile

Abstract: We introduce a general framework for end-to-end optimization of the rate-distortion performance of nonlinear transform codes assuming scalar quantization. The framework can be used to optimize any differentiable pair of analysis and synthesis transforms in combination with any differentiable perceptual metric. As an example, we consider a code built from a linear transform followed by a form of multi-dimensional local gain control. Distortion is measured with a state-of-the-art perceptual metric. When optimized over a large database of images, this representation offers substantial improvements in bitrate and perceptual appearance over fixed (DCT) codes, and over linear transform codes optimized for mean squared error.
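The rate-distortion trade-off named in this abstract can be illustrated with a small numerical sketch. It is not the paper's method. It assumes a random orthonormal linear transform in place of the learned analysis and synthesis pair, mean squared error in place of the perceptual metric, and the empirical entropy of the quantized coefficients as the rate estimate. The names `rd_objective`, `step`, and `lam` are hypothetical.

```python
import numpy as np

def empirical_entropy_bits(q):
    """Average bits per quantized coefficient, from the pooled histogram of values."""
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def rd_objective(x, W, step, lam):
    """Evaluate rate + lam * distortion for a linear transform code (assumed setup)."""
    y = x @ W.T                    # analysis transform
    q = np.round(y / step)         # uniform scalar quantization
    x_hat = (q * step) @ W         # synthesis: W is orthonormal, so its inverse is W.T
    rate = empirical_entropy_bits(q)
    distortion = float(np.mean((x - x_hat) ** 2))  # MSE stands in for a perceptual metric
    return rate + lam * distortion, rate, distortion

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 8))                     # toy data, not images
W, _ = np.linalg.qr(rng.normal(size=(8, 8)))       # random orthonormal transform

coarse = rd_objective(x, W, step=1.0, lam=10.0)
fine = rd_objective(x, W, step=0.1, lam=10.0)
print("coarse (loss, rate, distortion):", coarse)
print("fine   (loss, rate, distortion):", fine)
```

A finer quantization step costs more bits and lowers the distortion. The weight `lam` sets which of the two dominates the combined objective.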

Scale-Space Flow for End-to-End Optimized Video Compression

Agustsson et al. 2020

Nonlinear Transform Coding

Ballé, Chou, Minnen et al. 2021, IEEE J. Sel. Top. Signal Process.

165

We review a class of methods that can be collected under the name nonlinear transform coding (NTC), which over the past few years have become competitive with the best linear transform codecs for images, and have superseded them in terms of rate-distortion performance under established perceptual quality metrics such as MS-SSIM. We assess the empirical rate-distortion performance of NTC with the help of simple example sources, for which the optimal performance of a vector quantizer is easier to estimate than with natural data sources. To this end, we introduce a novel variant of entropy-constrained vector quantization. We provide an analysis of various forms of stochastic optimization techniques for NTC models; review architectures of transforms based on artificial neural networks, as well as learned entropy models; and provide a direct comparison of a number of methods to parameterize the rate-distortion trade-off of nonlinear transforms, introducing a simplified one.

Perceptual image quality assessment using a normalized Laplacian pyramid

Laparra, Ballé, Berardino et al. 2016

134

We present an image quality metric based on the transformations associated with the early visual system: local luminance subtraction and local gain control. Images are decomposed using a Laplacian pyramid, which subtracts a local estimate of the mean luminance at multiple scales. Each pyramid coefficient is then divided by a local estimate of amplitude (weighted sum of absolute values of neighbors), where the weights are optimized for prediction of amplitude using (undistorted) images from a separate database. We define the quality of a distorted image, relative to its undistorted original, as the root mean squared error in this "normalized Laplacian" domain. We show that both luminance subtraction and amplitude division stages lead to significant reductions in redundancy relative to the original image pixels. We also show that the resulting quality metric provides a better account of human perceptual judgements than either MS-SSIM or a recently-published gain-control metric based on oriented filters.
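The steps described in this abstract (Laplacian pyramid, division by a local amplitude estimate, root mean squared error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published metric. The binomial filter, the nearest-neighbour upsampling, the constant `sigma`, and the reuse of the blur kernel as neighbour weights are all placeholders. The paper optimizes its weights on a separate image database.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # assumed binomial low-pass filter

def blur(img):
    """Separable 5-tap low-pass filter with reflective padding."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def laplacian_pyramid(img, levels):
    """Subtract a local estimate of mean luminance at each scale."""
    bands, current = [], img.astype(float)
    for _ in range(levels - 1):
        low = blur(current)[::2, ::2]
        # Simplified upsampling: pixel repetition followed by smoothing.
        up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
        up = blur(up[:current.shape[0], :current.shape[1]])
        bands.append(current - up)
        current = low
    bands.append(current)  # residual low-pass band
    return bands

def normalize(band, sigma=0.1):
    """Divide by a constant plus a weighted sum of neighbouring absolute values.
    sigma and the weights (the blur kernel) are hypothetical placeholders."""
    return band / (sigma + blur(np.abs(band)))

def nlp_distance(reference, distorted, levels=3):
    """Root mean squared error over all normalized pyramid coefficients."""
    ref = [normalize(b) for b in laplacian_pyramid(reference, levels)]
    dis = [normalize(b) for b in laplacian_pyramid(distorted, levels)]
    diffs = np.concatenate([(r - d).ravel() for r, d in zip(ref, dis)])
    return float(np.sqrt(np.mean(diffs ** 2)))

rng = np.random.default_rng(0)
image = rng.random((32, 32))                              # toy image in [0, 1]
noisy = image + 0.05 * rng.normal(size=image.shape)
print(nlp_distance(image, image), nlp_distance(image, noisy))
```

The distance is zero for identical images and positive once the distorted image differs from the reference.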

End-to-end optimization of nonlinear transform codes for perceptual quality