2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.00373
LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity

Cited by 28 publications (19 citation statements). References 14 publications.
“…Some work can directly generate the layout from user-input text and further generate images based on the layout. Wu et al. (Wu et al. 2023) address infidelity issues by imposing spatial-temporal attention control based on the pixel regions of each object predicted by a LayoutTransformer (Yang et al. 2021). However, their algorithm is time-consuming, with each generation taking around 10 minutes.…”
Section: Related Work
confidence: 99%
“…For sequence generation, the input order is an important factor. We compare the AR and NAR models with the different element orders used in previous works (Yang et al. 2021; Kong et al. 2022): (1) position, where elements are sorted by their top-left coordinates. While most previous works follow this setting, it actually leaks ground-truth information during inference, since the absolute positions in real layouts are used to determine the order; (2) category, where input elements are fed per category (e.g., generate all the paragraph elements first, followed by the table elements).…”
Section: Non-autoregressive Decoding Analysis
confidence: 99%
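The two element orderings contrasted in the excerpt above can be sketched in a few lines. This is a minimal illustration, not code from any of the cited papers; the element representation (a `(category, x, y, w, h)` tuple) and function names are assumptions made for the example.

```python
# Hypothetical sketch of the two layout-element orderings discussed above.
# Each element is assumed to be a (category, x, y, w, h) tuple.

def order_by_position(elements):
    # Position ordering: sort by top-left coordinate (y first, then x).
    # Note this uses the absolute positions of the real layout, which is
    # the source of the ground-truth leakage mentioned in the excerpt.
    return sorted(elements, key=lambda e: (e[2], e[1]))

def order_by_category(elements, category_order):
    # Category ordering: emit all elements of each category in turn
    # (e.g., all paragraphs first, then all tables).
    rank = {c: i for i, c in enumerate(category_order)}
    return sorted(elements, key=lambda e: rank[e[0]])

layout = [("table", 10, 40, 80, 30),
          ("paragraph", 10, 5, 80, 20),
          ("paragraph", 10, 70, 80, 20)]

print(order_by_position(layout)[0][0])                           # paragraph
print(order_by_category(layout, ["paragraph", "table"])[-1][0])  # table
```

Because `sorted` is stable, elements that tie on the sort key keep their original relative order in both schemes.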
“…Layout generation refers to arranging elements (i.e., their sizes and positions) on a canvas, which is essential for creating visually appealing graphic designs (e.g., articles, user interfaces). State-of-the-art systems (Jyothi et al. 2019; Arroyo, Postels, and Tombari 2021; Kikuchi et al. 2021; Yang et al. 2021) mostly view the task as a sequence generation problem, where the sequence is composed of element attribute tokens (i.e., category, position, size). The majority of these works follow the autoregressive (AR) approach, which generates one token at a time conditioned on the previous outputs, and have achieved promising results (Yang et al. 2021; Guo, Huang, and Xie 2021; Yang et al. 2023; Weng et al. 2023).…”
Section: Introduction
confidence: 99%
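The "layout as a sequence of attribute tokens" formulation described above can be sketched as follows. This is an illustrative serialization under assumed conventions (a quantization grid, `<bos>`/`<eos>` markers, and normalized coordinates), not the tokenization of any specific cited system.

```python
# Minimal sketch: flatten a layout into a token sequence of per-element
# attribute tokens (category, x, y, w, h), as in AR layout generation.
# The grid size and special tokens are illustrative assumptions.

def layout_to_tokens(elements, grid=32):
    tokens = ["<bos>"]
    for cat, x, y, w, h in elements:
        # Quantize normalized coordinates to a discrete grid so every
        # attribute becomes one token from a finite vocabulary; an AR
        # model then predicts this sequence one token at a time.
        tokens += [cat] + [f"<{int(v * grid)}>" for v in (x, y, w, h)]
    tokens.append("<eos>")
    return tokens

seq = layout_to_tokens([("title", 0.1, 0.05, 0.8, 0.1)])
print(seq)  # ['<bos>', 'title', '<3>', '<1>', '<25>', '<3>', '<eos>']
```

Each element thus contributes five tokens, so a layout of n elements becomes a sequence of 5n + 2 tokens for the model to generate.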
“…In particular, we use the Transformer model [Vaswani et al. 2017], whose built-in self-attention mechanism has been widely applied in natural language processing. Such autoregressive models have also been used to generate images [Oord et al. 2016], sketches [Ribeiro et al. 2020], geometry [Nash et al. 2020], and layouts [Yang et al. 2021]. However, the structure of material graphs, with their arbitrary numbers of nodes, varying numbers of input and output edges, and functional constraints on those edges, makes material graph generation significantly more challenging than generating text, images, or meshes.…”
Section: Related Work
confidence: 99%