2021 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv48922.2021.00212
Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism

Cited by 19 publications (14 citation statements) · References 15 publications
“…faces), which is hard to train for less structured domains like full body images. Recently, Jiang et al [37] propose a new language-guided editing model specifically designed for performing global edits such as changing brightness or color tone of an image. Language based video manipulation.…”
Section: Related Work (mentioning; confidence: 99%)
“…Computational approaches [5,9,30] for modifying the style and appearance of objects in natural photographs have made remarkable progress, allowing beginner users to accomplish a wide range of editing effects. Nevertheless, it should be noted that prior text-based image manipulations work either did not allow for arbitrary text commands or image manipulation [5,25,34] or only allowed for modifications to the image's appearance properties or style [20,30]. Controlling the localization of modifications normally requires the user to draw a region to specify [1,2,5], which adds to the complexity of the operation.…”
Section: Introduction (mentioning; confidence: 99%)
“…T HERE are various active branches of image manipulation, such as style transfer [5], image translation [6], [7], and Text-Guided Image Manipulation (TGIM), by taking advantage of recent deep generative architectures such as GANs [8], VAEs [9], auto-regressive models [10] and diffusion models [11]. Particularly, the previous TGIM methods either operate some objects by text instructions [12]- [14], such as "adding" and "removing" in a simple toy scene, or manipulating the appearance of objects [15] or the style of the image [16], [17]. In this work, we are interested in a novel challenging task of entity-Level Text-Guided Image Manipulation (eL-TGIM), which is to manipulate the entities on a natural image given the text descriptions, as shown in Fig.…”
Section: Introduction (mentioning; confidence: 99%)