2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00713

Video Compression With Rate-Distortion Autoencoders

Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find that our method outperforms the state-of-the-art learned video compression networks based on motion compensation or …
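
The rate-distortion loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of mean squared error as the distortion, and the form "distortion + beta * rate" are assumptions chosen for illustration. The rate term is the ideal code length, in bits, of the discrete latent symbols under the prior used for entropy coding.

```python
import math


def rate_bits(latent_probs):
    """Ideal code length in bits: sum of -log2 p(z_i) over the latent symbols.

    latent_probs holds the probability the prior assigns to each symbol that
    was actually emitted (hypothetical input, for illustration only).
    """
    return sum(-math.log2(p) for p in latent_probs)


def mse(x, x_hat):
    """Mean squared error between the original and the reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def rate_distortion_loss(x, x_hat, latent_probs, beta):
    """Assumed objective: distortion plus beta times rate."""
    return mse(x, x_hat) + beta * rate_bits(latent_probs)


# Toy values: four "pixels" and three latent symbols.
x = [0.0, 0.5, 1.0, 0.25]
x_hat = [0.0, 0.5, 0.5, 0.25]
probs = [0.5, 0.25, 0.25]

loss = rate_distortion_loss(x, x_hat, probs, beta=0.01)
print(f"rate = {rate_bits(probs):.1f} bits, loss = {loss:.4f}")
```

In this sketch, a larger `beta` penalizes bits more heavily and so favors lower bitrates at the cost of higher distortion; a sharper prior (higher probability on the symbols actually emitted) lowers the rate term directly.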

Citations: cited by 193 publications (167 citation statements)
References: 26 publications
“…Recent studies in learned image compression, e.g., [2,3], show the great potential of deep learning for improving the rate-distortion performance. It is therefore not surprising to see increasing interest in compressing video with Deep Neural Networks (DNNs) [8,38,9,22,13]. For example, Lu et al [22] proposed using optical flow for motion compensation and applying auto-encoders to compress the flow and residual.…”
Section: Introduction (mentioning)
confidence: 99%
“…Instead of the current video coding standard (HEVC), the upcoming video coding standard called Versatile Video Coding (VVC) [28] can also be combined with our method. We also intend to investigate other coding methods including 3D HEVC [7,10], and learned video coding methods [19][20][21][22] to clarify the strength of each method and to evaluate the potential synergy with our method. Finally, we hypothesize that generating basis images is also effective for other applications such as coding of high-speed videos.…”
Section: Results (mentioning)
confidence: 99%
“…The experimental results show that this method achieves better rate-distortion performance than other HEVC-based LF coding methods [12][13][14]18]. Reflecting the recent explosive prevalence of deep neural networks, learning-based video coding methods [19][20][21][22] have attracted much interest. However, these methods are still struggling to outperform the sophisticated video codecs.…”
Section: Introduction (mentioning)
confidence: 99%
“…[30], however, we use pointclouds for the front layers and a closed mesh for the backmost layer. Furthermore, instead of directly accessing and optimizing depth information, we adapt ideas from autoencoders [9] to store scene depths in a neural network. Our network compresses the required depth information and offers an efficient way to optimize for temporal and spatial smoothness.…”
Section: Related Work (mentioning)
confidence: 99%