C4Synth: Cross-Caption Cycle-Consistent Text-to-Image Synthesis
2018 · Preprint
DOI: 10.48550/arxiv.1809.10238

Abstract: Generating an image from its description is a challenging task worth solving because of its numerous practical applications, ranging from image editing to virtual reality. Existing methods use a single caption to generate a plausible image. A single caption by itself can be limited and may not capture the variety of concepts and behavior present in the image. We propose two deep generative models that generate an image by making use of multiple captions describing it. This is achi…

Cited by 1 publication (1 citation statement, published 2022) · References 21 publications
“…Many existing methods ignore the use of multiple captions, even though a single caption is limited and rarely captures all of the concepts in an image. C4Synth [162] addressed this by proposing a new cross-caption cycle-consistency model and a recurrent variant of it, inspired by CycleGAN [190]. The model follows a consistent text–image–text cycle: it predicts a caption from the generated image and matches it against the succeeding caption among the multiple captions.…”
Section: Supervised T2I
confidence: 99%
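
To make the cycle concrete, below is a minimal, self-contained PyTorch sketch of the text → image → text loop described in the statement above: an image is generated from caption i, a caption is predicted back from that image, and the prediction is matched against caption i+1. All module names, tensor shapes, and the toy MLP/GRU blocks are illustrative assumptions, not the authors' architecture (the paper builds on GAN-based generators; the plain networks here stand in purely to show where the cycle-consistency loss attaches).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, IMG = 1000, 128, 256, 64 * 64 * 3  # toy sizes, not from the paper

class TextEncoder(nn.Module):
    """Caption tokens -> fixed-size conditioning vector."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, tokens):              # tokens: (B, T) int64
        _, h = self.rnn(self.emb(tokens))
        return h.squeeze(0)                 # (B, HID)

class ImageGenerator(nn.Module):
    """Conditioning vector -> flattened image (stand-in for a GAN generator)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HID, HID), nn.ReLU(),
            nn.Linear(HID, IMG), nn.Tanh())

    def forward(self, cond):
        return self.net(cond)               # (B, IMG)

class ImageCaptioner(nn.Module):
    """Generated image -> per-step vocabulary logits (toy decoder)."""
    def __init__(self, max_len=16):
        super().__init__()
        self.max_len = max_len
        self.proj = nn.Linear(IMG, HID)
        self.head = nn.Linear(HID, VOCAB)

    def forward(self, img):                 # img: (B, IMG)
        h = torch.tanh(self.proj(img))
        # Reuses one hidden state at every step; a real captioner
        # would decode autoregressively.
        return self.head(h).unsqueeze(1).expand(-1, self.max_len, -1)

def cross_caption_cycle_loss(captions, enc, gen, cap):
    """captions: list of (B, T) token tensors describing the same image.
    The caption predicted from the image generated off caption i is
    matched against caption i+1 (wrapping around), so the consistency
    signal chains across all captions instead of closing on one."""
    loss = 0.0
    for i, cur in enumerate(captions):
        nxt = captions[(i + 1) % len(captions)]   # succeeding caption
        img = gen(enc(cur))                       # text -> image
        logits = cap(img)                         # image -> text
        T = min(logits.size(1), nxt.size(1))
        loss = loss + F.cross_entropy(
            logits[:, :T].reshape(-1, VOCAB), nxt[:, :T].reshape(-1))
    return loss / len(captions)

# Usage: a batch of 4 images, each described by two captions.
enc, gen, cap = TextEncoder(), ImageGenerator(), ImageCaptioner()
caps = [torch.randint(0, VOCAB, (4, 16)) for _ in range(2)]
print(cross_caption_cycle_loss(caps, enc, gen, cap).item())
```

Chaining each generated image's predicted caption to the next caption, rather than back to its own, is what lets the consistency signal aggregate concepts from all captions of the image, which is the point the citing paper highlights.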