2018
DOI: 10.3233/jifs-169505

Towards the use of entropy as a measure for the reliability of automatic MT evaluation metrics

Abstract: The study describes an experiment with different estimates of reliability. Reliability reflects the technical quality of a measurement procedure, such as the automatic evaluation of Machine Translation (MT). It indicates how accurate the measurement is; in our case, the measurement of the accuracy and error rate of MT output based on automatic metrics (precision, recall, F-measure, BLEU-n, WER, PER, and CDER). The experiment identified metrics (BLEU-4 and WER) that reduce the overall reliability …

Cited by 13 publications (12 citation statements)
References 16 publications
“…The neural machine translation achieved a better translation quality due to the reference. Its translation was more accurate in both meaning and fluency; for example: (7) Source: Michel called on residents to "stay calm and cool-headed" as the investigation continued into Tuesday's police raid. mt@ec_SMT: Michel vyzval, aby zachovali pokoj a "cool-headed"ako prešetrovanie pokračovalo utorkového policajným zásahom.…”
Section: Discussion (mentioning)
confidence: 99%
“…They measure the closeness and/or compute the score of their lexical concordance. According to the criterion of lexical concordance, automatic metrics of MT evaluation can be divided into metrics of accuracy (matched n-grams) and metrics of error rate (edit distance) [7].…”
Section: Introduction (mentioning)
confidence: 99%
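The split described in the statement above, accuracy metrics built on matched n-grams versus error-rate metrics built on edit distance, can be illustrated with a minimal Python sketch. The function names `ngram_precision` and `word_error_rate` are hypothetical and chosen for illustration; whitespace tokenization, a single reference, and clipped n-gram counts are assumptions, not the setup used in the cited paper.

```python
from collections import Counter


def ngram_precision(hypothesis: str, reference: str, n: int = 1) -> float:
    """Accuracy-type metric: share of hypothesis n-grams found in the reference.

    Counts are clipped, so a repeated hypothesis n-gram is credited at most
    as often as it occurs in the reference (an assumption of this sketch).
    """
    hyp = hypothesis.split()
    ref = reference.split()
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return matched / total


def word_error_rate(hypothesis: str, reference: str) -> float:
    """Error-rate metric: word-level edit distance divided by reference length."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


reference = "the cat sat on the mat"
hypothesis = "the cat sat on mat"
print(ngram_precision(hypothesis, reference, n=1))
print(word_error_rate(hypothesis, reference))
```

Higher values are better for the accuracy-type score and worse for the error-rate score, which is why the two families are usually reported separately.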
“…The tags are of unequal length, but most tags follow the same structure for the same inflectional paradigm. The information of the part of speech is often more important than the structure or the length of the tags [26,27] and it is encoded in the first position of every tag. The second position usually marks an inflectional paradigm for words that have this category.…”
Section: Methods (mentioning)
confidence: 99%
“…We focused on automatic metrics, but we also used the post-editing method, which is a manual evaluation of MT quality. According to the criterion of concordance, we selected automatic metrics based on lexical concordance and divided them into metrics of accuracy and metrics of error rate (Munk, Munkova & Benko, 2018).…”
Section: Automatic Evaluation Metrics (mentioning)
confidence: 99%