In this age of information technology, people all over the world can communicate in different languages through social media platforms with the help of machine translation (MT) systems. As far as the Arabic-English language pair is concerned, most studies have evaluated MT output for the standard varieties of Arabic, with fewer studies focusing on the vernacular or colloquial varieties. This study attempts to address this gap by presenting an evaluation of MT output for vernacular or colloquial Arabic in the social media domain. As it is currently the most widely used MT system, Google Translate (GT) was chosen in order to evaluate the reliability of its output when translating colloquial Arabic (i.e., the Egyptian/Cairene Arabic variety) used in social media into English. With this goal in mind, a corpus of Egyptian dialectal Arabic sentences was collected from social media networks, i.e., Facebook and Twitter, and then fed into the GT system. The GT output was then evaluated by three human translators to assess its translation accuracy in terms of adequacy and fluency. The results of the study reveal several translation problems in the GT output. These problems mainly concern wrong equivalents, inappropriate additions and deletions, and transliteration of out-of-vocabulary (OOV) words, and they are mostly due to the literal translation of the Arabic vernacular sentences into English. This may be because Arabic vernacular varieties differ from the standard language for which MT systems have primarily been developed. Consequently, such MT systems need to be upgraded to deal with the vernacular varieties.
Research has shown that parallel corpora have potential benefits for translator training and education. Most of the currently available Arabic corpora, whether Modern Standard or dialectal, are monolingual, and there is an apparent lack of Arabic-English parallel corpora for the translation classroom. The present study aimed to investigate the translation problems encountered by Omani translation-major students when translating from Arabic into English, with a view to proposing a corpus-informed pedagogical approach that trains student translators to overcome these challenges by looking at model samples of professional translation. Thirty students voluntarily took part in the investigation. The study adopted a combination of corpus and qualitative methodology: some typical problems that students encounter when translating from Arabic into English were selected, specific Arabic texts involving these problems were prepared, and the participants were asked to translate them into English. The participants were then provided with samples of the parallel English translations and were asked to compare and contrast their own translations with these samples and reflect on the overall experience. They were subsequently interviewed to explore their impressions of parallel corpora and the extent to which they think such corpora would help them improve their translation. The data analysis indicated that the participants experienced several translation challenges. They nevertheless showed an overall positive attitude towards parallel Arabic-English corpora, as they reportedly found them very helpful in improving their translation. Pedagogical implications for corpus-informed translation teaching, training, and materials design and development are presented and discussed.
The present paper describes a machine translation (MT) course taught to undergraduate students in the Department of English Language and Literature at Dhofar University in Oman. The course is one of the major requirements for the BA in Translation. Fifteen EFL translation students in their third year of study were enrolled in the course. The author presents both the theoretical and practical parts of the course. For the theoretical part, the topics covered in the course are outlined. The practical part focuses on the translation students’ post-editing of online MT output. This is beneficial to the students, as free online MT systems can potentially be used as a means of improving student translators’ training and EFL learning. This is achieved by having the students analyze or post-edit MT output so that they can focus on the differences between the source and target languages. With this goal in mind, the students were given assignments to post-edit the Arabic and English MT output of three free online MT systems (Systran, Babylon and Google Translate), discuss the linguistic problems that they spot in each system's output, and choose the system with the fewest errors. The results show that the students, with varying degrees of success, managed to identify some linguistic errors in the output of each MT system and thus produced a better human translation. The paper concludes that there is a need to incorporate MT courses in translation departments in the Arab world, as integrating technology into translation curricula will have a great effect on student translators’ training for their future careers as professional translators.
Some attempts have been made in the academic community to carry out an automatic morphological analysis of the Qur'anic text. Among the well-known endeavors in this regard is the morphological annotation of the Quranic Arabic Corpus (QAC), which was carried out at Leeds University, UK. In addition, researchers at the University of Haifa had previously implemented a computational system for the morphological analysis of the Qur'an. More recently, a new Quranic corpus has been built at Mohammed I University in Morocco. To the best of our knowledge, these are the only three studies to produce a morphologically analyzed, part-of-speech tagged Qur'an encoded as a structured linguistic database. This paper surveys the morphological analysis in the above-mentioned annotation projects and compares them to test the quality of their analysis using five criteria: display of the text in the corpus, word segmentation, morphological disambiguation, part-of-speech (POS) tag set, and manual verification. The paper concludes that the QAC of Leeds and the Quranic corpus of Morocco surpass the Quranic corpus of Haifa with regard to most of these criteria. Furthermore, some additional POS tags for derivative nouns are suggested as a step towards a more fine-grained tag set that could be proposed for POS tagging of Qur'anic Arabic.