2022
DOI: 10.20944/preprints202201.0018.v1
Preprint

Guidance to Pre-tokenization for SacreBLEU: Meta-Evaluation in Korean

Abstract: SacreBLEU, by incorporating a text normalizing step in the pipeline, has been well received as an automatic evaluation metric in recent years. With agglutinative languages such as Korean, however, the metric cannot provide a meaningful result without the help of customized pre-tokenization. In this regard, this paper examines the influence of diversified pre-tokenization schemes (word, morpheme, character, and subword) on the aforementioned metric by performing a meta-evaluati…
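As an illustration of why the choice of pre-tokenization scheme matters, the following is a minimal sketch rather than the paper's method. It covers only the word and character schemes, since morpheme and subword segmentation need external analyzers. The function names, the `▁` space marker, and the example sentences are hypothetical, and a clipped unigram precision stands in for the full BLEU computation. In practice the pre-tokenized strings could be passed to SacreBLEU with its internal tokenizer disabled (for example `tokenize='none'`).

```python
from collections import Counter


def pretokenize_word(sentence: str) -> str:
    """Word level: split on whitespace (illustrative helper)."""
    return " ".join(sentence.split())


def pretokenize_char(sentence: str, space_symbol: str = "▁") -> str:
    """Character level: one token per character; spaces become a marker."""
    normalized = " ".join(sentence.split())
    tokens = [space_symbol if ch == " " else ch for ch in normalized]
    return " ".join(tokens)


def clipped_precision(hyp_tok: str, ref_tok: str, n: int = 1) -> float:
    """Clipped n-gram precision, the building block of BLEU."""
    hyp, ref = hyp_tok.split(), ref_tok.split()
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return matched / total


# Hypothetical sentence pair that differs in a single verb ending.
hypothesis = "나는 학교에 갔다"
reference = "나는 학교에 간다"

word_p = clipped_precision(pretokenize_word(hypothesis), pretokenize_word(reference))
char_p = clipped_precision(pretokenize_char(hypothesis), pretokenize_char(reference))
print(f"word-level unigram precision: {word_p:.3f}")
print(f"char-level unigram precision: {char_p:.3f}")
```

The same sentence pair scores differently under the two schemes, because a single differing ending invalidates a whole word-level token but only one character-level token.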

Cited by 1 publication (3 citation statements)
References 11 publications
“…Achieving a SacreBLEU score of 32.90 in the task of translating from Japanese to Ainu highlights the high quality of translations despite the Ainu language's complex polysynthetic structure Ortega et al [2020]. Furthermore, the accomplishment of a SacreBLEU score of 29.91 in the bi-directional translation task between Japanese and Ainu attests to the capability of the neural MT framework to handle intricate and resource-scarce languages efficiently Kim and Kim [2022b].…”
Section: Discussion
Citation type: mentioning
Confidence: 89%
“…The positive SacreBLEU scores garnered from our experiments highlight MT's potential utility in supporting language preservation and revitalization projects. The data indicates that even within the constraints of limited linguistic resources, MT models can achieve a degree of accuracy that renders them valuable tools for learners and scholars engaged with the Ainu language Kim and Kim [2022b].…”
Section: Results
Citation type: mentioning
Confidence: 99%