“…Building a strong paraphrase generation system usually requires massive amounts of high-quality annotated paraphrase pairs, but existing labeled datasets (Lin et al., 2014; Fader et al., 2013; Lan et al., 2017) are either small in size or restricted to narrow domains. To avoid such a heavy reliance on labeled data, recent works have explored unsupervised methods (Li et al., 2018b; Fu et al., 2019; Siddique et al., 2020) that generate paraphrases without annotated training data, among which the back-translation-based model is an archetype (Sokolov and Filimonov, 2020). It borrows the idea of back-translation (BT) from machine translation (Sennrich et al., 2016): the model first translates a sentence s₁ into a sentence s₂ in a different language (e.g., En→Fr), and then translates s₂ back into the original language, so that the back-translated sentence serves as a paraphrase of s₁.…”
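To make the round-trip scheme concrete, the following is a minimal sketch (not the setup of any cited system) using off-the-shelf Helsinki-NLP Marian translation models via the Hugging Face transformers library; the specific model names, pivot language, and decoding settings are illustrative assumptions.

```python
# Round-trip translation paraphrasing sketch (illustrative; assumes the
# Hugging Face `transformers` and `sentencepiece` packages are installed).
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model, tokenizer):
    # Tokenize a batch, generate translations with beam search, and decode.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**batch, num_beams=4, max_length=128)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Forward (En->Fr) and backward (Fr->En) models; any pivot language works.
fw_name, bw_name = "Helsinki-NLP/opus-mt-en-fr", "Helsinki-NLP/opus-mt-fr-en"
fw_tok = MarianTokenizer.from_pretrained(fw_name)
fw_model = MarianMTModel.from_pretrained(fw_name)
bw_tok = MarianTokenizer.from_pretrained(bw_name)
bw_model = MarianMTModel.from_pretrained(bw_name)

s1 = "How can I improve my writing skills?"
s2 = translate([s1], fw_model, fw_tok)[0]          # s1 -> pivot sentence s2
paraphrase = translate([s2], bw_model, bw_tok)[0]  # s2 -> back-translated paraphrase of s1
print(paraphrase)
```

Replacing beam search with sampling in either direction is a common way to obtain more diverse paraphrases from the same round trip.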