“…Non-autoregressive neural machine translation (Gu et al., 2018) aims to enable the parallel generation of output tokens without sacrificing translation quality. There has been a surge of recent interest in this family of efficient decoding models, resulting in the development of iterative refinement (Lee et al., 2018), CTC models (Libovický and Helcl, 2018), insertion-based methods (Chan et al., 2019b), edit-based methods (Gu et al., 2019; Ruis et al., 2019), masked language models (Ghazvininejad et al., 2019, 2020b), and normalizing flow models (Ma et al., 2019). Some of these methods generate the output tokens in a constant number of steps (Gu et al., 2018; Libovický and Helcl, 2018; Lee et al., 2018; Ghazvininejad et al., 2019, 2020b), while others require a logarithmic number of generation steps (Chan et al., 2019b,a; Li and Chan, 2019).…”
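As a rough illustration of the step-count contrast the excerpt draws, the sketch below compares autoregressive decoding (one sequential step per output token) with mask-predict-style non-autoregressive decoding (a constant number of parallel refinement iterations, in the spirit of Ghazvininejad et al., 2019). The functions `predict_next` and `predict_all` are hypothetical stand-ins for a trained model, not any cited system's API.

```python
# Toy sketch: sequential vs. constant-step decoding.
# `predict_next` / `predict_all` are hypothetical model stand-ins.

def predict_next(prefix):
    # Stand-in for an autoregressive model: returns the next token.
    return f"tok{len(prefix)}"

def predict_all(tokens):
    # Stand-in for a non-autoregressive model: refines every masked
    # position at once (conceptually a single parallel forward pass).
    return [t if t != "<mask>" else f"tok{i}" for i, t in enumerate(tokens)]

def autoregressive_decode(length):
    # One generation step per output token: O(length) sequential steps.
    prefix = []
    for _ in range(length):
        prefix.append(predict_next(prefix))
    return prefix, length  # (tokens, number of sequential steps)

def mask_predict_decode(length, iterations=3):
    # Constant number of refinement iterations, independent of length.
    tokens = ["<mask>"] * length
    for _ in range(iterations):
        tokens = predict_all(tokens)
    return tokens, iterations  # (tokens, number of sequential steps)

if __name__ == "__main__":
    _, ar_steps = autoregressive_decode(20)
    _, nar_steps = mask_predict_decode(20)
    print(f"autoregressive: {ar_steps} steps; non-autoregressive: {nar_steps} steps")
```

For a 20-token output, the autoregressive loop takes 20 sequential steps while the mask-predict loop takes 3, regardless of length; the logarithmic-step methods cited above (e.g., insertion-based decoding) fall between these two regimes.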