2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00649

StoryGAN: A Sequential Conditional GAN for Story Visualization

Abstract: We propose a new task, called Story Visualization. Given a multi-sentence paragraph, the story is visualized by generating a sequence of images, one for each sentence. In contrast to video generation, story visualization focuses less on the continuity in generated images (frames), but more on the global consistency across dynamic scenes and characters, a challenge that has not been addressed by any single-image or video generation methods. We therefore propose a new story-to-image-sequence generation model, Sto…

Cited by 144 publications (144 citation statements)
References 35 publications
“…Progressive Manipulation A few state of the art approaches for text-to-image generation and image manipulation are expressly designed for conversational systems [5,8,9,21,30]. In Figure 5 we show that we can repeatedly apply our method to generated images to have a manipulation in multiples steps.…”
Section: Results (mentioning)
confidence: 99%
“…proposed StoryGAN [21], which generates a series of images that are contextually coherent with previously generated images and with the sequence of text descriptions provided by the user.…”
Section: Input Text Features (mentioning)
confidence: 99%
“…[39] allow users to input object instance masks into an existing image represented by a semantic layout. [40] generate images iteratively from consecutive textual commands, [41] provide interactive image editing based on a current image and instructions on how to update the image, and [42] generate individual images for a sequence of sentences. [43] do interactive image generation but do not use text as direct input but instead update a scene graph from text over the course of the interaction.…”
Section: Related Work (mentioning)
confidence: 99%
“…Dialogue based interaction is studied to control image synthesis, in order to improve complex scene generation progressively [219]-[223]. Meanwhile, text-to-image synthesis is extended to multiple images or videos, where visual consistency is required among the generated images [224]-[226].…”
Section: Other Topics (mentioning)
confidence: 99%