Proceedings of the 27th ACM International Conference on Multimedia 2019
DOI: 10.1145/3343031.3350929

Editing Text in the Wild

Abstract: In this paper, we are interested in editing text in natural images, which aims to replace or modify a word in the source image with another one while maintaining its realistic look. This task is challenging, as the styles of both background and text need to be preserved so that the edited image is visually indistinguishable from the source image. Specifically, we propose an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module …

Cited by 110 publications (72 citation statements)
References 36 publications
“…Our best model is currently capable of reaching a per image accuracy of 0.697, which is comparable to the accuracy of 0.824 achieved by the model on raw data. We outperform the work of [1], which achieves a per image accuracy of 0.517. The values of the validation proxy metrics could be seen in Table 3.…”
Section: Results (mentioning)
confidence: 88%
“…The task of the fusion module is to fuse the conversion module results and the background inpainting module results. The key difference is that [1] uses a text skeleton based conversion module, while [24] uses a control points based one. Also the modules of [1] are learned independently while [24] proposes a fully differentiable architecture.…”
Section: Realistic Text Replacement (mentioning)
confidence: 99%