2021
DOI: 10.48550/arxiv.2107.13136
Preprint
Insights from Generative Modeling for Neural Video Compression

Abstract: While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present recent neural video codecs as instances of a generalized stochastic temporal autoregressive transform, and propose new avenues for further improvements…
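The abstract's central object, a stochastic temporal autoregressive transform, decodes each frame as a prediction from the previous frame plus a scaled, decoded residual, and encodes by inverting that map. A minimal NumPy sketch of this structure, where the shift/scale predictors are hypothetical fixed functions standing in for learned networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_shift(prev_frame):
    # mu_t: prediction of the current frame from the previous one
    return 0.9 * prev_frame

def predict_scale(prev_frame):
    # sigma_t: strictly positive per-pixel scale for the residual
    return 0.1 + 0.05 * np.abs(prev_frame)

def decode_latent(z):
    # g(z): decoder from (quantized) latent to residual; identity here
    return z

def encode_frame(x_t, x_prev):
    # Inverse transform: z_t = (x_t - mu_t) / sigma_t
    return (x_t - predict_shift(x_prev)) / predict_scale(x_prev)

def decode_frame(z_t, x_prev):
    # Forward transform: x_t = mu_t + sigma_t * g(z_t)
    return predict_shift(x_prev) + predict_scale(x_prev) * decode_latent(z_t)

x_prev = rng.standard_normal((4, 4))
x_t = 0.9 * x_prev + 0.05 * rng.standard_normal((4, 4))
x_hat = decode_frame(encode_frame(x_t, x_prev), x_prev)
assert np.allclose(x_hat, x_t)  # lossless round trip without quantization
```

In an actual codec the latent `z_t` would be quantized and entropy-coded, making the round trip lossy; the sketch omits that step to show only the transform's shape.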

Cited by 2 publications (5 citation statements). References 39 publications.
“…We conjecture that, since the predicted next frame may be already motion-compensated, the resulting residual is sparser and, hence, easier to capture by the diffusion model. Similar observations have been made in neural video compression [9, 26].…”
Section: Ablation Studies (supporting)
confidence: 83%
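The sparsity claim in this statement can be illustrated with a toy sequence whose motion is known exactly: once the previous frame is motion-compensated, the residual is all zeros and compresses far better than the raw frames. This is an illustrative setup only, using `zlib` as a stand-in for a learned entropy model, not the cited method:

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)

# Toy video: a random texture translating one pixel per frame.
base = rng.integers(0, 256, (64, 64), dtype=np.uint8)
frames = [np.roll(base, shift=t, axis=1) for t in range(10)]

# Baseline: compress each frame independently.
# A random texture is essentially incompressible.
raw_bytes = sum(len(zlib.compress(f.tobytes())) for f in frames)

# Motion-compensated coding: send the first frame, then residuals against
# the motion-compensated previous frame. With the true motion known, each
# residual is exactly zero, hence extremely sparse and cheap to encode.
coded_bytes = len(zlib.compress(frames[0].tobytes()))
for prev, cur in zip(frames, frames[1:]):
    prediction = np.roll(prev, shift=1, axis=1)  # known one-pixel motion
    residual = cur.astype(np.int16) - prediction.astype(np.int16)
    coded_bytes += len(zlib.compress(residual.tobytes()))

print(coded_bytes, "<", raw_bytes)
```

Real codecs must also estimate and transmit the motion, so the gap is smaller in practice, but the direction of the effect is the same.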
“…Some of these models show impressive rate-distortion performance with hierarchical structures that separately encode the prediction and error residual. Although compression models have different goals from generative models, both benefit from predictive sequential priors [26]. Note, however, that these models are ill-suited for generation since compression models typically have a small spatio-temporal context and are constructed to preserve local information rather than to generalize [12].…”
Section: Neural Video Compression Models (mentioning)
confidence: 99%
“…Digitization in business and science raises the demand for data storage, and the proliferation of remote working arrangements makes high-quality videoconferencing indispensable. Recently, novel compression methods that employ probabilistic machine-learning models have been shown to outperform more traditional compression methods for images and videos (Yang et al., 2020a; Agustsson et al., 2020; Yang et al., 2021). Machine learning provides new methods for the declarative task of expressing complex probabilistic models, which are essential for data compression (see Section 2.1).…”
Section: Introduction (mentioning)
confidence: 99%