CASMACAT is a modular, web-based translation workbench that offers advanced functionalities for computer-aided translation and the scientific study of human translation: automatic interaction with machine translation (MT) engines and translation memories (TM) to obtain raw translations or close TM matches for conventional post-editing; interactive translation prediction based on an MT engine's search graph; detailed recording and replay of edit actions and of the translator's gaze (the latter via eye-tracking); and support for an e-pen as an alternative input device. The system is open source software and interfaces with multiple MT systems.
The main goal of this thesis is to develop computer assisted translation and machine translation systems which present a more robust synergy with their potential users. Hence, the main purpose is to make current state-of-the-art systems more ergonomic, intuitive and efficient, so that the human expert feels more comfortable when using them. To this end, different techniques are presented, focusing on improving the adaptability and response time of the underlying statistical machine translation systems, as well as a strategy aimed at enhancing human-machine interaction within an interactive machine translation setup. The ultimate purpose of all of this is to bridge the existing gap between the state of the art in machine translation and the tools that are usually available to human translators.
Concerning the response time of the machine translation systems, a parameter pruning technique is presented, whose intuition stems from the concept of bilingual segmentation, but which evolves towards a full parameter re-estimation strategy. Experimental results presented here show that, with this strategy, it is possible to reduce the number of parameters required by up to 97% without a significant loss in translation quality. Since they are robust across different language pairs, these results indicate that the pruning technique is effective in a traditional machine translation scenario, and could be used, for instance, in a post-editing setup. Nevertheless, experiments carried out within a simulated interactive machine translation environment are slightly less convincing, since a trade-off between response time and translation quality is needed.
Two orthogonally different approaches are presented with the purpose of increasing the adaptability of the statistical machine translation systems.
On the one hand, we investigate how to increase the adaptability of the language model by subdividing it into several smaller language models, which are then interpolated at translation time according to the source sentence to be translated. The specific sub-models are built either by taking advantage of supervised information present in certain bilingual corpora, or by performing unsupervised clustering on the training set, with the aim of uncovering the specific sub-topics or language styles present in it. On the other hand, Bayesian predictive adaptation is presented as an efficient strategy for adapting the translation models present in state-of-the-art machine translation systems. Although adaptation experiments are only performed within the traditional machine translation framework, the results obtained are compelling enough to justify implementing these techniques within an interactive setup, and such work will be done in the near future. Nevertheless, it should be noted that the techniques developed may be readily implemented within a computer assisted translation scenario, in which a statistical machine translation system provides the translations that the user needs to modify and validate. Finally...