2021
DOI: 10.48550/arxiv.2105.10026
Preprint

Improving Generation and Evaluation of Visual Stories via Semantic Consistency

citations
Cited by 5 publications
(16 citation statements)
references
References 0 publications
“…There are limited number of methods working on the same story visualization task as ours: StoryGAN [14], CP-CSV [24], DUCO [17], and VLC [16]. StoryGAN is based on generative adversarial networks, and CP-CSV, DUCO, and VLC are built on top of StoryGAN, where CP-CSV relies on character segmentation masks to provide an additional supervision, and DUCO and VLC utilized auxiliary captioning networks to keep consistency.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…CP-CSV [24] was built on StoryGAN, and utilized segmentation masks to improve the character consistency. DUCO [17] and VLC [16] also adopted StoryGAN as the backbone, and added auxiliary captioning networks to build a text-image-text circle. However, all these methods still struggle with the quality of output images, and fail to generate fine-grained regional details.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…This is different from video since it lacks continuous frame prediction, having a temporal relation to show a smooth motion transition. However, the literature reports only a few studies on the story-generation task, most of which are retrieval-based [301][302][303], while some pay attention to GAN models [31,231,233,235,237,239] and almost none of the other generative models are explored, except [230].…”
Section: Story (Consistent) (citation type: mentioning)
confidence: 99%