“…Several techniques have been proposed to alleviate the accuracy degradation, including 1) knowledge distillation (Oord et al., 2017; Gu et al., 2017; Guo et al., 2019a,b), and 2) imposing source-target alignment constraints with fertility (Gu et al., 2017), word mapping (Guo et al., 2019a), attention distillation (Li et al., 2019b), and duration prediction. With the help of these techniques, NAR models have been observed to match the accuracy of AR models on some tasks, but a gap remains on others (Gu et al., 2017).…”