“…Morpheme-based segmentation for Korean has proven beneficial. Many downstream applications for Korean language processing, such as POS tagging (Jung, Lee, and Hwang 2018; Park and Tyers 2019), phrase-structure parsing (Choi, Park, and Choi 2012; Park, Hong, and Cha 2016; Kim and Park 2022), and machine translation Cha, 2016, 2017b), are based on morpheme-based segmentation, in which all morphemes are separated from each other. In these studies, morpheme-based segmentation is used to avoid data sparsity, because the number of possible words at a coarser segmentation granularity (such as eojeols) can grow exponentially, given that Korean is an agglutinative language.…”