“…Earlier work in CSC focused mainly on unsupervised methods, such as language models with a pre-constructed confusion set (Yu and Li, 2014). Subsequently, some work cast CSC as a sequential labeling problem, in which conditional random fields (CRF) (Lafferty et al., 2001) and gated recurrent networks (Hochreiter and Schmidhuber, 1997; Chung et al., 2014) have been employed to model the problem (Zheng et al., 2016; Xie et al., 2017; Wu et al., 2018). More recently, motivated by a series of remarkable successes achieved by neural network-based sequence-to-sequence learning (Seq2Seq) in various natural language processing (NLP) tasks (Sutskever et al., 2014), generative models have also been applied to the spelling check task by framing it as an encoder-decoder problem (Xie et al., 2016; Ge et al., 2018).…”