SketchyCOCO: Image Generation From Freehand Scene Sketches

Gao, Chengying; Liu, Qi; Xu, Qi; Wang, Limin; Liu, Jianzhuang; Zhang, Changqing

doi:10.1109/cvpr42600.2020.00522

Cited by 115 publications

(88 citation statements)

References 36 publications


“…Furthermore, even artificial sketch generators, when trained to create a sketch to convey the essence of an image with as few strokes as possible learn to first draw lines that have the most power to convey essential content [22]. In fact, there have recently been a number of artificial neural networks trained to generate sketches that are as easily recognizable as those generated by a human [23,24]. We here show that drawings that take advantage of the visual system's mechanisms for understanding scenes will be more easily interpreted [25].…”

Section: Discussion (mentioning)

confidence: 99%

Where to Draw The Line?

Sheng

Wilder

Walther

2021

Preprint


We often take people’s ability to understand and produce line drawings for granted. But where should we draw lines, and why? We address fundamental principles that underlie efficient representations of complex information in line drawings. First, 58 participants with varying degrees of artistic experience produced multiple drawings of a small set of scenes by tracing contours on a digital tablet. Second, 37 independent observers ranked the drawings by how representative they are of the original photograph. Overall, artists’ drawings ranked higher than non-artists’. Matching contours between drawings of the same scene revealed that the most consistently drawn contours tend to be drawn earlier. We generated half-images with the most-versus least-consistently drawn contours by sorting contours by their consistency scores. Twenty-five observers performed significantly better in a fast scene categorization task for the most compared to the least consistent half-images. The most consistent contours were longer and more likely to depict occlusion boundaries. Using psychophysics experiments and computational analysis, we confirmed quantitatively what makes certain contours in line drawings special: longer contours mark occlusion boundaries and aid rapid scene recognition. They allow artists and non-artists to convey important information starting from the first few strokes in their drawing process.


“…It is measured by retrieving relevant text given an image query. For sketch-based image synthesis, classification accuracy is used to measure the realism of the synthesized objects [7,8] and how well the identities of synthesized results match those of real images [26]. Also, similarity between input sketches and edges of synthesized images can be measured to evaluate the correspondence between the input and output [8].…”

Section: How Do We Evaluate the Output Synthesized Images? (mentioning)

confidence: 99%

“…For sketch-based image synthesis, classification accuracy is used to measure the realism of the synthesized objects [7,8] and how well the identities of synthesized results match those of real images [26]. Also, similarity between input sketches and edges of synthesized images can be measured to evaluate the correspondence between the input and output [8]. In the scenario of pose-guided person image synthesis, "masked" versions of IS and SSIM, Mask-IS and Mask-SSIM are often used to ignore the effects of the background [27][28][29][30][31], since we want to focus on the synthesized human body.…”

Section: How Do We Evaluate the Output Synthesized Images? (mentioning)

confidence: 99%
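The sketch-to-edge correspondence mentioned in the statement above can be illustrated with a minimal sketch. The quoted text does not say which similarity measure or edge detector the cited works use, so the choices here are assumptions: a finite-difference gradient threshold stands in for the edge detector, and intersection-over-union (IoU) stands in for the similarity score. The function names `edge_map` and `sketch_edge_iou` are hypothetical.

```python
import numpy as np

def edge_map(gray, threshold=0.2):
    """Binary edge map of a grayscale image in [0, 1].

    Assumption: a plain gradient-magnitude threshold is used as a
    stand-in for whatever edge detector a real evaluation would use.
    """
    gy, gx = np.gradient(gray.astype(float))  # derivatives along rows, columns
    return np.hypot(gx, gy) > threshold

def sketch_edge_iou(sketch, image_gray, threshold=0.2):
    """IoU between sketch strokes (nonzero pixels) and image edges.

    Assumption: IoU is one possible similarity score, not necessarily
    the one used in the cited papers.
    """
    strokes = sketch > 0
    edges = edge_map(image_gray, threshold)
    union = np.logical_or(strokes, edges).sum()
    if union == 0:
        return 1.0  # no strokes and no edges: treat as a perfect match
    return float(np.logical_and(strokes, edges).sum() / union)

# Toy example: a synthesized "image" with one vertical step edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# A sketch whose strokes lie on the step edge, and one that misses it.
aligned = np.zeros((8, 8))
aligned[:, 3:5] = 1
misplaced = np.zeros((8, 8))
misplaced[:, 0] = 1

aligned_score = sketch_edge_iou(aligned, image)
misplaced_score = sketch_edge_iou(misplaced, image)
```

A higher score means the edges of the synthesized image follow the input strokes more closely. In practice the threshold and the edge detector would need tuning for real images.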

“…An interesting next pursuit would be to see if computers can mimic creative processes such as those used by painters in making pictures, or assisting artists or architects in making artistic or architectural designs. In fact, in the past decade, we have witnessed advances in systems that synthesize an image from a text description [1][2][3][4] or from a learned style of content [5], paint a picture given a sketch [6][7][8][9], render a photorealistic scene from a wireframe [10,11], and create virtual reality content from images and videos [12], among others. A comprehensive review of such systems can explain the current state-of-the-art in such pursuits, reveal open challenges, and illuminate future directions.…”

Section: Introduction (mentioning)

confidence: 99%


Deep image synthesis from intuitive user input: A review and perspectives

Guo

Zhang

et al. 2021

Comp. Visual Media


In many applications of computer graphics, art, and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph, or layout, and have a computer system automatically generate photo-realistic images according to that input. While classically, works that allow such automatic image content generation have followed a framework of image retrieval and composition, recent advances in deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and flow-based methods have enabled more powerful and versatile image generation approaches. This paper reviews recent works for image synthesis given intuitive user input, covering advances in input versatility, image generation methodology, benchmark datasets, and evaluation metrics. This motivates new perspectives on input representation and interactivity, cross fertilization between major image generation paradigms, and evaluation and comparison of generation methods.


“…Besides, Sun et al [31] focused on the layout-to-image generation and proposed an intuitive paradigm to bridge the gap between input labels and generated images. Given a scene sketch, Gao et al [7] implemented a controllable image generation method to meet the specific requirements. Also, there are many generative models [18,9,28,25] that take the text as the input for multi-modal text-to-image generation.…”

Section: 2 (mentioning)

confidence: 99%

Global-Affine and Local-Specific Generative Adversarial Network for semantic-guided image generation

Zhang

Ni

Hou

et al. 2021

MFC


The recent progress in learning image feature representations has opened the way for tasks such as label-to-image or text-to-image synthesis. However, one particular challenge widely observed in existing methods is the difficulty of synthesizing fine-grained textures and small-scale instances. In this paper, we propose a novel Global-Affine and Local-Specific Generative Adversarial Network (GALS-GAN) to explicitly construct global semantic layouts and learn distinct instance-level features. To achieve this, we adopt a graph convolutional network to calculate the instance locations and spatial relationships from scene graphs, which allows our model to obtain high-fidelity semantic layouts. Also, a local-specific generator, in which we introduce a feature filtering mechanism to separately learn semantic maps for different categories, is utilized to disentangle and generate specific visual features. Moreover, we apply a weight map predictor to better combine the global and local pathways, since these two generation sub-networks are highly complementary. Extensive experiments on the COCO-Stuff and Visual Genome datasets demonstrate the superior generation performance of our model against previous methods; our approach is more capable of capturing photo-realistic local characteristics and rendering small-sized entities with more details.

