“…Early studies (Gamon et al., 2008; Tetreault et al., 2010; Dahlmeier and Ng, 2011; Berend et al., 2013; Rozovskaya and Roth, 2014) treat GEC as a classification task and rely heavily on hand-crafted rules. More recently, statistical and neural machine translation techniques have been applied to GEC and have achieved remarkable performance (Behera and Bhattacharyya, 2013; Junczys-Dowmunt and Grundkiewicz, 2016; Junczys-Dowmunt et al., 2018; Chollampatt and Ng, 2018; Zhao et al., 2019; Awasthi et al., 2019; Kiyono et al., 2019; Kaneko et al., 2020; Omelianchuk et al., 2020; Zhao and Wang, 2020).…”