Generative Models for Low-Dimensional Video Representation and Reconstruction

Hyder, Rakib; Asif, M. Salman

doi:10.1109/tsp.2020.2977256

Cited by 19 publications

(15 citation statements)

References 37 publications


“…We consider a specific object-independent angular sampling order for time-sequential sampling of the projections for this model and analyze factors affecting uniqueness and stability of the solution. ProSep does not use any spatial prior for the object, but in numerical experiments shows performance superior to the recently proposed GMLR [13] - a deep image prior model for video. We expect that combining a spatial image prior with ProSep will improve its performance even further.…”

Section: Introduction (mentioning)

confidence: 93%

Dynamic Tomography Reconstruction by Projection-Domain Separable Modeling

Iskender, Klasky, Bresler

2022

Preprint


In dynamic tomography the object undergoes changes while projections are being acquired sequentially in time. The resulting inconsistent set of projections cannot be used directly to reconstruct an object corresponding to a time instant. Instead, the objective is to reconstruct a spatio-temporal representation of the object, which can be displayed as a movie. We analyze conditions for unique and stable solution of this ill-posed inverse problem, and present a recovery algorithm, validating it experimentally. We compare our approach to one based on the recently proposed GMLR variation on deep prior for video, demonstrating the advantages of the proposed approach.


“…This enables us to obtain a controlled environment of diverse video generation from learned latent vectors for each video in the given dataset, while maintaining almost uniform quality. In addition, the proposed approach also allows a concise video data representation in form of learned vectors, frame interpolation (using a low rank constraint introduced in [12]), and generation of videos unseen during the learning paradigm.…”

Section: MoCoGAN (Adversarial) (mentioning)

confidence: 99%

“…The objective of video frame interpolation is to synthesize non-existent frames in-between the reference frames. While the triplet condition ensures that similar frames have their transient latent vectors nearby, it doesn't ensure that they lie on a manifold where simple linear interpolation will yield latent vectors that generate frames with plausible motion compared to preceding and succeeding frames [4,12]. This means that the transient latent subspace can be represented in a much lower dimensional space compared to its larger ambient space.…”

Section: Low Rank Representation For Interpolation (mentioning)

confidence: 99%

“…This means that the transient latent subspace can be represented in a much lower dimensional space compared to its larger ambient space. So, to enforce such a property, we project the latent vectors into a low dimensional space while learning them along with the network weights, first proposed in [12]. Mathematically, the loss in (6) can be written as…”

Section: Low Rank Representation For Interpolation (mentioning)

confidence: 99%
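The low-dimensional projection of latent vectors described in the statement above can be illustrated with a minimal sketch. It assumes the per-frame latent codes are stacked as rows of a matrix and uses a truncated SVD as a stand-in projection; the quoted loss (6) is truncated in the source and is not reproduced here. The function name `project_low_rank` and all sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def project_low_rank(Z, rank):
    """Best rank-`rank` approximation of Z in Frobenius norm (truncated SVD).

    Z holds one latent vector per row. This is an illustrative stand-in for
    a low-rank constraint, not the cited papers' exact procedure.
    """
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_trunc = s.copy()
    s_trunc[rank:] = 0.0          # discard all but the leading singular values
    return (U * s_trunc) @ Vt     # scale columns of U, then recombine

rng = np.random.default_rng(0)
Z = rng.standard_normal((16, 32))  # hypothetical: 16 frames, 32-dim latent codes
Z_lr = project_low_rank(Z, rank=4)
```

In a joint optimization, a projection of this kind could be applied to the latent matrix after each update step, so that the learned codes stay in a subspace of much lower dimension than the ambient latent space.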


Non-Adversarial Video Synthesis with Learned Priors

Aich, Gupta, Panda, et al. 2020

Preprint

Self Cite


Most of the existing works in video synthesis focus on generating videos using adversarial learning. Despite their success, these methods often require input reference frame or fail to generate diverse videos from the given data distribution, with little to no uniformity in the quality of videos that can be generated. Different from these methods, we focus on the problem of generating videos from latent noise vectors, without any reference input frames. To this end, we develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network and a generator through non-adversarial learning. Optimizing for the input latent space along with the network weights allows us to generate videos in a controlled environment, i.e., we can faithfully generate all videos the model has seen during the learning process as well as new unseen videos. Extensive experiments on three challenging and diverse datasets well demonstrate that our proposed approach generates superior quality videos compared to the existing state-of-the-art methods.


“…In another line of work, untrained convolutional network architectures have been used as image prior. Deep image prior (DIP) [15] and its variants [16,17] utilize the structural bias of convolutional networks towards producing natural images [18] in fewer update iterations compared to modeling noise. Using x = G(z, θ) where G(z, θ) is a generator network using latent code, z and network weights θ, we can write the DIP prior as…”

Section: Introduction (mentioning)

confidence: 99%
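The parameterization x = G(z, θ) in the statement above can be made concrete with a small sketch. The quoted expression is truncated in the source, so the code instead assumes the commonly stated DIP fitting problem: keep the latent code z fixed and minimize 0.5 * ||G(z, θ) - y||² over the weights θ for a noisy observation y. The generator here is a toy two-layer fully connected network with hand-derived gradients, not a convolutional architecture, so it shows only the optimization setup and not the structural bias toward natural images. All names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, h = 64, 8, 32                          # signal, latent, hidden sizes (illustrative)
x_true = np.sin(np.linspace(0, 4 * np.pi, n))
y = x_true + 0.1 * rng.standard_normal(n)    # noisy observation of x_true

z = rng.standard_normal(k)                   # latent code, kept fixed
W1 = 0.1 * rng.standard_normal((h, k))       # theta = (W1, W2), randomly initialized
W2 = 0.1 * rng.standard_normal((n, h))

def G(z, W1, W2):
    """Toy generator: x = W2 @ tanh(W1 @ z)."""
    return W2 @ np.tanh(W1 @ z)

lr = 0.002
losses = []
for _ in range(500):
    a = np.tanh(W1 @ z)                      # hidden activations
    r = W2 @ a - y                           # residual G(z, theta) - y
    losses.append(0.5 * r @ r)
    grad_W2 = np.outer(r, a)                 # d(loss)/dW2
    grad_W1 = np.outer((W2.T @ r) * (1 - a**2), z)  # backprop through tanh
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

x_hat = G(z, W1, W2)                         # current reconstruction
```

Because the weights are fitted to a single noisy observation, running such a loop to convergence would eventually fit the noise as well, which is why DIP-style methods typically stop the iterations early or add regularization.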

A Consensus Equilibrium Solution For Deep Image Prior Powered By RED

Hyder, Mansour, et al. 2021

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite


Recent advances in solving imaging inverse problems have witnessed the combination of deep learning models with classical image models for better signal representation. One such approach, DeepRED, combines the deep image prior (DIP) with the regularization by denoising (RED) framework to boost the performance of image deblurring and super resolution tasks. In this paper, we formulate DeepRED as a consensus equilibrium problem and set up a fixed-point algorithm for solving the equilibrium equations. We also derive sufficient conditions that the DIP generative prior should satisfy to ensure that the corresponding fixed-point operator is nonexpansive. We then demonstrate that the fixed-point algorithm that solves the CE equations results in improved image reconstruction quality in a deblurring setting compared to state-of-the-art methods.
