Generative Adversarial Network Including Referring Image Segmentation For Text-Guided Image Manipulation

Watanabe, Yuto; Togo, Ren; Maeda, Kazuo; Ogawa, Takahiro; Haseyama, Miki

doi:10.1109/icassp43922.2022.9746970

Cited by 7 publications

(5 citation statements)

References 15 publications

“…Note that this paper is an extension of a previously published paper [19]. The main difference between our previous study and this one is the introduction of the CLIP loss.…”

Section: Introduction (mentioning)

confidence: 85%

Text-Guided Image Manipulation via Generative Adversarial Network With Referring Image Segmentation-Based Guidance

et al. 2023

This study proposed a novel text-guided image manipulation method that introduces referring image segmentation into a generative adversarial network. The proposed text-guided image manipulation method aims to manipulate images containing multiple objects while preserving text-unrelated regions. The proposed method assigns the task of distinguishing between text-related and unrelated regions in an image to segmentation guidance based on referring image segmentation. With this architecture, the adversarial generative network can focus on generating new attributes according to the text description and reconstructing text-unrelated regions. For the challenging input images with multiple objects, the experimental results demonstrate that the proposed method outperforms conventional methods in terms of image manipulation precision.

INDEX TERMS: text-guided image manipulation, text-to-image synthesis, generative adversarial network, referring image segmentation.

Previous text-guided image manipulation methods use mechanisms that can select the text-related region and at-
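As a rough illustration of how a segmentation mask can keep text-unrelated regions intact, the sketch below blends a generated image into the original only where the mask marks a pixel as text-related. This is not the architecture of the cited paper: the function name `composite_with_mask`, the nested-list image layout, and the simple linear blend are all assumptions made for the example.

```python
def composite_with_mask(original, generated, mask):
    """Blend `generated` into `original` where `mask` is 1 (text-related).

    Hypothetical helper for illustration only. Images are nested lists of
    pixel tuples (rows -> pixels -> channels); `mask` is a nested list of
    values in [0, 1] with one entry per pixel.
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated and mask must have the same height")

    result = []
    for o_row, g_row, m_row in zip(original, generated, mask):
        if not (len(o_row) == len(g_row) == len(m_row)):
            raise ValueError("original, generated and mask must have the same width")
        # Per pixel: mask = 1 keeps the generated value, mask = 0 keeps the original.
        result.append([
            tuple(m * g + (1 - m) * o for o, g in zip(o_px, g_px))
            for o_px, g_px, m in zip(o_row, g_row, m_row)
        ])
    return result
```

In practice the same blend would be applied to image arrays or tensors; plain lists are used here only to keep the example dependency-free.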

“…The model is trained on extracted features from training images, achieving an accuracy of 79% for 4k images and 99.5% for 51k images. Y. Watanabe, R. Togo [26] This paper introduces text-guided image manipulation, a concept where natural language descriptions are used to control image generation for user-friendly manipulation. Methods like CMPC-Refseg and text-guided feature exchange modules are proposed to semantically alter image appearance to meet user requirements, overcoming limitations of traditional image manipulation techniques.…”

Section: Literature Survey (mentioning)

confidence: 99%

Reverse Sign Language Recognition

Gaikwad 2024, IJSREM

Sign language is an important mode of communication for deaf, mute, and disabled people. Sign language is a language in which people communicate with the help of gestures in situations where they cannot speak or hear. Gestures are an effective method of human interaction and are often used by deaf people to communicate. This article describes various methods and techniques for recognizing text signatures in images. This review compares different methods and algorithms with the help of a pie chart. Key Words: communication, sign recognition, hand gestures, sign language

“…The research on text-guided image editing was rapidly accelerated by the emergence of the generative adversarial network (GAN) [39]. Approaches to GAN-based text-guided image editing can be divided into two categories: (1) the approaches [10,11,40–43] utilizing a unique network with a single or multi-stage architecture, and (2) the approaches [12,13,44–46] leveraging the representation capabilities of a pretrained StyleGAN [16,47,48]. In approach (1), some studies [40,41] have applied an encoder-decoder architecture and successfully generated 64 × 64 resolution edited images on datasets such as Oxford-102 flower [49] and Caltech-UCSD Birds [50].…”

Section: Text-guided Image Editing (mentioning)

confidence: 99%

“…In approach (1), some studies [40,41] have applied an encoder-decoder architecture and successfully generated 64 × 64 resolution edited images on datasets such as Oxford-102 flower [49] and Caltech-UCSD Birds [50]. To generate high-resolution edited images on complex image dataset such as MSCOCO [51], several studies [10,11,42,43] construct a multi-stage architecture with a generator and discriminator at each stage. Three stages are trained at the same time, and progressively generate edited images of three different resolutions, i.e., 64 × 64 → 128 × 128 → 256 × 256.…”

Section: Text-guided Image Editing (mentioning)

confidence: 99%

Text-Guided Image Editing Based on Post Score for Gaining Attention on Social Media

Watanabe, Togo, Maeda et al. 2024, Sensors

Self Cite

Text-guided image editing has been highlighted in the fields of computer vision and natural language processing in recent years. The approach takes an image and text prompt as input and aims to edit the image in accordance with the text prompt while preserving text-unrelated regions. The results of text-guided image editing differ depending on the way the text prompt is represented, even if it has the same meaning. It is up to the user to decide which result best matches the intended use of the edited image. This paper assumes a situation in which edited images are posted to social media and proposes a novel text-guided image editing method to help the edited images gain attention from a greater audience. In the proposed method, we apply the pre-trained text-guided image editing method and obtain multiple edited images from the multiple text prompts generated from a large language model. The proposed method leverages the novel model that predicts post scores representing engagement rates and selects one image that will gain the most attention from the audience on social media among these edited images. Subject experiments on a dataset of real Instagram posts demonstrate that the edited images of the proposed method accurately reflect the content of the text prompts and provide a positive impression to the audience on social media compared to those of previous text-guided image editing methods.
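The selection step described in this abstract (edit the image once per prompt, score each result, keep the highest-scoring one) can be sketched as follows. This is a minimal sketch under stated assumptions: `select_best_edit`, `edit_fn`, and `score_fn` are hypothetical names standing in for the pre-trained editing method and the post-score predictor, neither of which is specified here.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Image = TypeVar("Image")


def select_best_edit(
    image: Image,
    prompts: Sequence[str],
    edit_fn: Callable[[Image, str], Image],  # hypothetical: text-guided editor
    score_fn: Callable[[Image], float],      # hypothetical: post-score predictor
) -> Tuple[str, Image, float]:
    """Return the (prompt, edited image, score) with the highest predicted score."""
    if not prompts:
        raise ValueError("at least one prompt is required")

    # Edit the input once per prompt, then score every candidate.
    scored = []
    for prompt in prompts:
        edited = edit_fn(image, prompt)
        scored.append((prompt, edited, score_fn(edited)))

    # max() keeps the first candidate when several share the top score.
    return max(scored, key=lambda item: item[2])
```

Passing the editor and the scorer as callables keeps the selection logic independent of any particular model, so either component could be swapped without touching this function.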
