2023
DOI: 10.3390/e25101469

Diffusion Probabilistic Modeling for Video Generation

Ruihan Yang,
Prakhar Srivastava,
Stephan Mandt

Abstract: Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame predicti…
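Below is a minimal, illustrative sketch (not the authors' implementation) of the general idea described in the abstract: a deterministic network proposes the next frame, a DDPM-style diffusion model samples a stochastic correction (residual) that is added to it, and the step is applied autoregressively to roll out a clip. All module names (NextFramePredictor, ResidualDenoiser), tensor shapes, the linear beta schedule, and the omission of timestep conditioning are simplifying assumptions made for illustration; the paper's actual model uses an architecture inspired by neural video compression and is optimized end to end.

# Illustrative sketch only: autoregressive video generation where a diffusion
# model corrects a deterministic next-frame prediction. Names, shapes, and the
# noise schedule are assumptions, not the authors' code.
import torch
import torch.nn as nn

T = 100                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)     # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class NextFramePredictor(nn.Module):
    """Deterministic next-frame estimate from the previous frame (toy conv net)."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )
    def forward(self, prev_frame):
        return self.net(prev_frame)

class ResidualDenoiser(nn.Module):
    """Predicts the noise added to the frame residual, conditioned on the
    deterministic prediction (concatenated along channels). Timestep
    conditioning is omitted here for brevity."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )
    def forward(self, noisy_residual, cond):
        return self.net(torch.cat([noisy_residual, cond], dim=1))

@torch.no_grad()
def sample_next_frame(predictor, denoiser, prev_frame):
    """One autoregressive step: deterministic prediction + DDPM-style residual sampling."""
    pred = predictor(prev_frame)
    x = torch.randn_like(pred)                      # start residual from pure noise
    for t in reversed(range(T)):
        eps = denoiser(x, pred)
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise     # standard DDPM reverse update
    return pred + x                                 # corrected next frame

@torch.no_grad()
def rollout(predictor, denoiser, first_frame, n_frames=4):
    """Generate a short clip by feeding each generated frame back in."""
    frames = [first_frame]
    for _ in range(n_frames):
        frames.append(sample_next_frame(predictor, denoiser, frames[-1]))
    return torch.stack(frames, dim=1)               # (batch, time, C, H, W)

if __name__ == "__main__":
    predictor, denoiser = NextFramePredictor(), ResidualDenoiser()
    clip = rollout(predictor, denoiser, torch.randn(1, 3, 32, 32))
    print(clip.shape)                               # torch.Size([1, 5, 3, 32, 32])

In a full implementation the denoiser would condition on several past frames and on the diffusion timestep, and training would minimize the usual noise-prediction loss on the residuals rather than using randomly initialized networks as above.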

Cited by 53 publications (5 citation statements)
References 66 publications
“…Models based on generative adversarial networks (GANs) [42–44] and variational autoencoders (VAEs) [45,46] have achieved excellent results in synthetic image generation. Nonetheless, there have recently been significant advancements in static image generation [47,48] and video generation [27,49–51] using diffusion models, even exceeding the performance of GAN and VAE models [27,52].…”
Section: Methods (mentioning)
confidence: 99%
“…The state-of-the-art GAN-based approaches were examined by Tyagi and Yadav [23], Frolov et al. [1], Zhou et al. [24], and Tan et al. [25]. In the diffusion model field, several works [20], [21], [26] review progress across all domains, while other articles explore particular areas in depth, including audio diffusion models [27], diffusion models for video generation [28], diffusion models in vision [29], and text-to-image diffusion models [30], providing a thorough overview of the field along with an in-depth look at its applications, limitations, and promising future directions. However, our work distinctively integrates the latest advancements in both GANs and diffusion models, providing a holistic view of the field.…”
Section: Related Surveys and Study Contribution (mentioning)
confidence: 99%
“…With its strong generative performance, the diffusion model is the new SOTA among current deep generative models. Works such as [152–154] demonstrated that the diffusion model has a strong ability to understand complex scenarios, and the video diffusion model can generate higher-quality videos. Works such as [155,156] utilized the diffusion model to generate complex and diverse driving scenarios.…”
Section: Prediction of Autonomous Driving Based on World Models (mentioning)
confidence: 99%