2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00385
Video Generation From Single Semantic Label Map

Abstract: This paper proposes the novel task of video generation conditioned on a single semantic label map, which provides a good balance between flexibility and quality in the generation process. Different from typical end-to-end approaches, which model both scene content and dynamics in a single step, we propose to decompose this difficult task into two sub-problems. As current image generation methods do better than video generation in terms of detail, we synthesize high-quality content by only generating the first …

Cited by 97 publications (59 citation statements)
References 38 publications
“…Trajectory Prediction: We use the fused feature z_cat,i of the i-th agent to predict its future trajectories (in the multi-modality case). Similar to [6], we employ three deconvolutional layers with stride 2 to upsample z_cat and obtain a feature representation r ∈ ℝ^(64×H×W), which can be written as r = Decode_pnet(z_cat). Another issue is that the future behavior of an agent follows a multimodal distribution.…”
Section: The Fine Prediction Network
Mentioning confidence: 99%
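The statement above upsamples a fused feature with three stride-2 deconvolutional (transposed-convolution) layers. A minimal sketch of the shape arithmetic involved, assuming kernel size 4 and padding 1 (common settings that make each layer exactly double the spatial size; the excerpt does not specify them), with hypothetical helper names:

```python
def deconv_out_size(size, kernel=4, stride=2, padding=1):
    """Output spatial size of one transposed convolution:
    (size - 1) * stride - 2 * padding + kernel."""
    return (size - 1) * stride - 2 * padding + kernel

def upsample_size(size, n_layers=3):
    """Spatial size after stacking n_layers stride-2 deconvolutions."""
    for _ in range(n_layers):
        size = deconv_out_size(size)
    return size

# With these settings each layer doubles the resolution:
print(upsample_size(8))  # 8 -> 16 -> 32 -> 64
```

With kernel 4, stride 2, padding 1, an input of spatial size 8 grows to 64 after three layers, matching the pattern of an encoder bottleneck being decoded back to an H × W map.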
“…All experiments are conducted on a machine with an NVIDIA Tesla V100 GPU. Evaluation Metrics: Similar to prior work [6,2,10], two metrics are used to evaluate the compared methods. They are Average Displacement Error (ADE), the mean Euclidean distance between the predicted and ground-truth trajectories, and Final Displacement Error (FDE), the Euclidean distance between the final location of the predicted trajectory and that of the ground truth.…”
Section: Baselines and Setups
Mentioning confidence: 99%
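The ADE and FDE definitions quoted above are straightforward to compute. A small self-contained sketch (hypothetical function names; trajectories are arrays of (x, y) points over T timesteps):

```python
import numpy as np

def ade(pred, gt):
    """Average Displacement Error: mean Euclidean distance between
    predicted and ground-truth positions over all timesteps."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

def fde(pred, gt):
    """Final Displacement Error: Euclidean distance between the
    final predicted position and the final ground-truth position."""
    return float(np.linalg.norm(pred[-1] - gt[-1]))

# Toy example: prediction moves along x, ground truth along the diagonal.
pred = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
gt   = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
print(ade(pred, gt), fde(pred, gt))  # 1.0 2.0
```

In the multi-modal setting mentioned in the first excerpt, these metrics are typically reported as the minimum over the sampled trajectory hypotheses (often written minADE / minFDE), though the excerpt itself does not state this.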
“…PredRNN++ [41] proposed Causal LSTMs with cascaded dual memories to model short-term video. Pan et al. [42] proposed to generate video frames from a single semantic map. Wang et al. [43] developed a point-to-point network which generates intermediate frames given the start and end frames.…”
Section: Related Work
Mentioning confidence: 99%