2023
DOI: 10.3390/math11041006

A Survey on Evaluation Metrics for Machine Translation

Abstract: The success of Transformer architecture has seen increased interest in machine translation (MT). The translation quality of neural network-based MT transcends that of translations derived using statistical methods. This growth in MT research has entailed the development of accurate automatic evaluation metrics that allow us to track the performance of MT. However, automatically evaluating and comparing MT systems is a challenging task. Several studies have shown that traditional metrics (e.g., BLEU, TER) show …

Cited by 29 publications (24 citation statements)
References: 33 publications
“…Recognizing the importance of evaluation methods is relevant to the study's assessment of attitudes toward TMS quality and performance. MT systems have applications in various domains, including the business and professional translation sectors (Lee et al., 2023). Understanding these applications is fundamental to comprehending the role of TMS in the translation industry.…”
Section: Introduction
mentioning
confidence: 99%
“…Blur criteria and scales for manual translation quality, along with different human evaluator sensitivity to translation errors may result in the judge subjectivity, which can be reflected in the poor consistency and instability of the evaluation results [5]. Human evaluation is an effective way to assess translation quality, but is challenging to find reliable bilingual annotators [6]. In addition to poor consistency and subjectivity, manual evaluation is both financially and time-consuming; however, unlike automatic evaluation, it does not require a reference translation.…”
Section: Introduction
mentioning
confidence: 99%
“…The advantages of automatic evaluation lie in its objectivity, consistency, stability, speed, reusability, and language independence. It is cost-effective and easy to use for comparing multiple systems, but at the expense of quality [6]. Furthermore, automatic evaluation requires reference—human translation (gold standard)—since the evaluation is based on comparing MT output with reference translation.…”
Section: Introduction
mentioning
confidence: 99%
“…BLEU scores can also be used to track language learners' progress over time as learners strive to produce text that is increasingly similar to human-generated text. BLEU is relatively simple to compute and widely used, but it has been criticized for not considering the meaning of the translated text or the context in which it is used (Lee et al., 2023). Therefore, using multiple metrics is important to get a complete picture.…”
mentioning
confidence: 99%