“…While diffusion models have shown impressive results on generation, editing, and other tasks (see Section 2), their main drawback is their long inference time, due to the iterative diffusion process applied at the pixel level to generate each result. Some recent works [Gu et al. 2021; Esser et al. 2021b; Bond-Taylor et al. 2021] have thus proposed to perform the diffusion in a latent space with lower dimensionality and higher-level semantics than pixels, yielding competitive performance on various tasks with much lower training and inference times. In particular, Latent Diffusion Models (LDM) ] offer this appealing combination of competitive image quality and fast inference. However, this approach targets text-to-image generation from scratch, rather than global image manipulation, let alone local editing.…”