Human rating of predicted post-editing effort is a common activity and has been used to train confidence estimation models. However, the correlation between human ratings and actual post-editing effort has rarely been measured. Moreover, the impact of presenting effort indicators in a post-editing user interface on actual post-editing effort has hardly been researched. In this study, ratings of perceived post-editing effort are tested for correlations with actual temporal, technical, and cognitive post-editing effort. In addition, we test whether presenting post-editing effort indicators in the user interface affects actual post-editing effort. The language pair in this study is English to Brazilian Portuguese. Our findings, based on a small sample, suggest that inter-rater agreement on predicted post-editing effort is low and that correlations between predicted and actual effort are only moderate, making perceived effort an unreliable basis for MT confidence estimation. Moreover, presenting post-editing effort indicators in the user interface appears not to affect actual post-editing effort.
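As a minimal sketch of the kind of analysis this abstract describes, the snippet below correlates perceived-effort ratings with measured temporal effort and approximates inter-rater agreement as a rank correlation between two raters. All data values and variable names are invented placeholders, and the choice of Spearman's rho is an assumption, not the study's reported procedure.

```python
# Illustrative sketch only: correlating perceived post-editing effort with
# measured temporal effort. All values below are invented placeholders.
from scipy.stats import spearmanr

# Perceived-effort ratings per segment from two raters (1 = little, 4 = heavy).
rater_a = [1, 3, 2, 4, 2, 3, 1, 4]
rater_b = [2, 3, 1, 4, 3, 2, 1, 3]

# Actual temporal effort: post-editing seconds per source word, per segment.
seconds_per_word = [1.2, 4.5, 2.1, 6.3, 3.0, 2.8, 0.9, 5.7]

# Inter-rater agreement approximated as a rank correlation between raters.
agreement, _ = spearmanr(rater_a, rater_b)

# Predicted vs. actual effort: rater A's ratings against measured time.
rho, p_value = spearmanr(rater_a, seconds_per_word)

print(f"Inter-rater rho = {agreement:.2f}")
print(f"Perceived vs. temporal effort: rho = {rho:.2f} (p = {p_value:.3f})")
```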
This paper reports on the results of a project that investigated the usability of raw machine-translated technical support documentation for a commercial online file storage service. Adopting a user-centred approach, we utilize the ISO/TR 16982 definition of usability (goal completion, satisfaction, effectiveness, and efficiency) and apply eye-tracking measures shown to be reliable indicators of cognitive effort, along with a post-task questionnaire. We investigated these measures for the original user documentation written in English and in four target languages: Spanish, French, German, and Japanese, all translated using a freely available online statistical machine translation engine. Using native speakers of each language, we found several significant differences between the source and the MT output, indicating a difference in usability between well-formed content and raw machine-translated content. One target language in particular, Japanese, showed considerably lower usability than the original English.
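One way such source-versus-target differences could be tested is sketched below, assuming per-participant fixation counts for the English original and the Japanese MT output are already available. The data and the choice of a Mann-Whitney U test are illustrative assumptions, not the paper's reported method.

```python
# Illustrative sketch only: testing whether an eye-tracking measure differs
# between the English source and one MT target (here Japanese).
# Fixation counts are invented placeholders, one value per participant.
from scipy.stats import mannwhitneyu

fixations_english = [112, 98, 105, 120, 101, 95, 110]
fixations_japanese = [178, 190, 164, 201, 185, 172, 195]

# Non-parametric test, since small samples make normality doubtful.
stat, p_value = mannwhitneyu(fixations_english, fixations_japanese,
                             alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")
```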
Eye tracking has been used successfully for some time as a technique for measuring cognitive load in reading, psycholinguistics, writing, and language acquisition. Its application as a technique for automatically measuring the reading ease of MT output has not, to our knowledge, been tested. We report here on a preliminary study testing the use and validity of an eye-tracking methodology as a means of semi- and/or fully automatically evaluating machine translation output. Fifty French machine-translated sentences, 25 rated as excellent and 25 rated as poor in an earlier human evaluation, were selected. Ten native speakers of French were instructed to read the MT sentences for comprehensibility. Their eye-gaze data were recorded non-invasively using a Tobii 1750 eye tracker. Average gaze time and fixation count were found to be higher for the "bad" sentences, while average fixation duration and pupil dilation did not differ substantially between output rated as good and output rated as bad. BLEU scores were also compared with the eye-gaze data and were found to correlate well with gaze time and fixation count, and to a lesser extent with pupil dilation and fixation duration. We conclude that eye-tracking data, in particular gaze time and fixation count, correlate reasonably well with human evaluation of MT output, but that fixation duration and pupil dilation may be less reliable indicators of reading difficulty for MT output. We also conclude that eye tracking holds promise as an automatic MT evaluation technique.
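The metric-versus-gaze comparison could be run along the lines of the sketch below, assuming per-sentence BLEU scores and average gaze times have already been computed. All values are invented placeholders, not the study's data.

```python
# Illustrative sketch only: correlating an automatic metric with gaze data.
# Per-sentence BLEU scores and average gaze times are invented placeholders.
from scipy.stats import pearsonr

bleu_scores = [0.72, 0.65, 0.31, 0.28, 0.80, 0.22, 0.55, 0.40]
gaze_time_ms = [2100, 2400, 4800, 5200, 1900, 5600, 3100, 4000]

# Lower-quality sentences should attract longer gaze times, so a negative
# correlation would be consistent with the finding reported above.
r, p_value = pearsonr(bleu_scores, gaze_time_ms)
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
```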
Machine translation (MT) quality is generally measured via automatic metrics, producing scores that have no meaning for translators who are required to post-edit MT output or for project managers who have to plan and budget for translation projects. This paper investigates correlations between two such automatic metrics (General Text Matcher and Translation Edit Rate) and post-editing productivity. For the purposes of this paper, productivity is measured via processing speed and cognitive measures of effort, using eye tracking as a tool. Processing speed, average fixation time, and fixation count are found to correlate well with the scores for groups of segments. Segments with high GTM and TER scores require substantially less time and cognitive effort than medium- or low-scoring segments. Future research involving score thresholds and confidence estimation is suggested.
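The grouping of segments by metric score could look like the sketch below, which bands segments by TER (which counts edits, so lower conventionally means less editing) and compares mean processing speed per band. Scores, speeds, and band cut-offs are invented assumptions, not the paper's thresholds.

```python
# Illustrative sketch only: banding segments by TER and comparing average
# processing speed per band. All values and cut-offs are invented.
ter_scores = [0.05, 0.12, 0.35, 0.48, 0.62, 0.75, 0.20, 0.55]
words_per_second = [1.8, 1.6, 1.1, 0.9, 0.6, 0.5, 1.4, 0.8]

def band(ter):
    # Hypothetical cut-offs; TER counts edits, so lower = less post-editing.
    if ter <= 0.2:
        return "low TER (little editing)"
    if ter <= 0.5:
        return "medium TER"
    return "high TER (heavy editing)"

groups = {}
for score, speed in zip(ter_scores, words_per_second):
    groups.setdefault(band(score), []).append(speed)

for name, speeds in groups.items():
    print(f"{name}: mean speed = {sum(speeds) / len(speeds):.2f} words/s")
```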
In September 2015, the ADAPT Centre for Digital Content Technology carried out a focus group study of 70 translators at the European Commission’s Directorate-General for Translation (DGT). The aim was to better understand the factors involved in the translators’ adoption and non-adoption of machine translation (MT) during their translation tasks. Our analysis showed that, while broadly positive attitudes to MT could be observed, MT was not consistently adopted for all tasks. We argue that ergonomic factors related to a human translator’s needs, abilities, limitations, and overall well-being strongly influenced participants’ decisions to use MT or not in their tasks. We further claim that these ergonomic factors can only be fully understood and explained by taking into account the special institutional circumstances in which the activity of DGT translation is situated.