Language transliteration is an important area of natural language processing. Accurate transliteration of named entities plays an important role in the performance of machine translation and cross-language information retrieval. A transliteration model must be designed so that the phonetic structure of words is preserved as closely as possible. We have developed a hybrid (statistical + rule-based) transliteration system for person names: given a person name written in Punjabi (Gurumukhi Script), the system produces its English (Roman Script) transliteration. In our experiments, the system achieves an overall accuracy of 95.23%.
In Statistical Machine Translation (SMT), many source words can have different translations or senses. A Word Sense Disambiguation (WSD) system is designed to determine which sense of an ambiguous word is invoked in the particular context around the word. It is an intermediate task essential to many natural language processing problems, including machine translation, information retrieval and speech processing. There is no cited work on resolving word ambiguity in the Myanmar language. This paper presents a new WSD method for ambiguous Myanmar words. It is based on a supervised learning approach, the Nearest Neighbor Cosine Classifier. The system uses a Myanmar-English parallel corpus as its training resource. As an advantage, the system can overcome the problem of translation ambiguity in Myanmar-to-English translation.
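The idea of a nearest-neighbor cosine classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words context representation, the function names `cosine` and `classify`, and the toy English stand-in data are all assumptions made for illustration. In the setting described above, the training examples would instead come from the Myanmar-English parallel corpus.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(context, training):
    """Return the sense label of the training example nearest to `context`.

    `context` is a list of words around the ambiguous word; `training` is a
    list of (context_words, sense_label) pairs. Both are hypothetical formats.
    """
    query = Counter(context)
    best_label, best_score = None, -1.0
    for words, label in training:
        score = cosine(query, Counter(words))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy stand-in data (assumed): two senses of the English word "bank".
TRAINING = [
    (["river", "water", "fishing"], "bank/shore"),
    (["money", "loan", "deposit"], "bank/finance"),
]

print(classify(["deposit", "money", "account"], TRAINING))
```

Each labeled context acts as its own prototype, so adding training data requires no retraining; the cost is that classification time grows with the number of stored examples.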
Reordering is one of the most challenging and important problems in Statistical Machine Translation. Without reordering capabilities, sentences can be translated correctly only when the two languages involved in the translation have a similar word order. When translating between language pairs with a high disparity in word order, word reordering is highly desirable for improving translation accuracy. Our language, Myanmar, is a verb-final language, and reordering is needed when translating into it from languages with different word orders. This paper presents automatic reordering rule generation and the application of the generated rules in a stochastic reordering model. This work is intended to be incorporated into an English-Myanmar machine translation system. To generate reordering rules, an English-Myanmar parallel tagged aligned corpus is first created. Reordering rules are then generated automatically using the linguistic information in this corpus. The proposed function tag and part-of-speech tag reordering rule extraction algorithms are used to generate the reordering rules automatically, and first-order Markov theory is applied to implement the stochastic reordering model.
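A first-order Markov model over tag sequences can be sketched as follows. This is only an illustration of the general technique, not the rule extraction algorithms proposed in the paper: the tag names (`SUBJ`, `OBJ`, `VERB`), the add-alpha smoothing, and the brute-force search over permutations are all assumptions chosen to keep the example small.

```python
import math
from collections import defaultdict
from itertools import permutations


def train_transitions(sequences):
    """Count tag bigrams (previous tag -> next tag) in target-order sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        prev = "<s>"
        for tag in list(seq) + ["</s>"]:
            counts[prev][tag] += 1
            prev = tag
    return counts


def log_prob(seq, counts, tagset_size, alpha=1.0):
    """Log-probability of a tag order under a first-order Markov model.

    Uses add-alpha smoothing so unseen transitions get a small probability.
    """
    total = 0.0
    prev = "<s>"
    for tag in list(seq) + ["</s>"]:
        row = counts.get(prev, {})
        numerator = row.get(tag, 0) + alpha
        denominator = sum(row.values()) + alpha * tagset_size
        total += math.log(numerator / denominator)
        prev = tag
    return total


def best_order(tags, counts, tagset_size):
    """Pick the highest-scoring permutation (feasible only for short inputs)."""
    return max(permutations(tags),
               key=lambda p: log_prob(p, counts, tagset_size))


# Hypothetical verb-final training sequences of function tags.
TARGET_ORDERS = [
    ["SUBJ", "OBJ", "VERB"],
    ["SUBJ", "OBJ", "VERB"],
    ["SUBJ", "VERB"],
]
COUNTS = train_transitions(TARGET_ORDERS)

# Reorder an English-style SUBJ-VERB-OBJ sequence; 4 = three tags plus "</s>".
print(best_order(["SUBJ", "VERB", "OBJ"], COUNTS, tagset_size=4))
```

Enumerating permutations is factorial in the sequence length, so a practical system would restrict candidates, for example to the orders licensed by extracted reordering rules, and use the Markov model only to choose among them.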
This paper presents a Myanmar phrase translation model with morphological analysis. The system is based on a statistical approach. In statistical machine translation, a large amount of information is needed to guide the translation process. When only a small amount of training data is available, morphological analysis is needed, especially for morphologically rich languages. Myanmar is an inflected language, and very few corpora have been created or studied for Myanmar compared with other languages such as English, French, and Czech. Therefore, the Myanmar phrase translation model is based on the syntactic structure and morphology of the Myanmar language. Bayes' rule is also used to reformulate the translation probability of phrase pairs. Experimental results show that the proposed system can improve translation quality by applying morphological analysis to the Myanmar language.
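For reference, the standard way Bayes' rule is applied to a translation probability in statistical machine translation is shown below. The abstract does not give the paper's exact formulation, so the symbols are assumptions: $s$ denotes a source phrase and $t$ a candidate target phrase.

```latex
\hat{t} \;=\; \arg\max_{t} P(t \mid s)
        \;=\; \arg\max_{t} \frac{P(s \mid t)\, P(t)}{P(s)}
        \;=\; \arg\max_{t} P(s \mid t)\, P(t)
```

The denominator $P(s)$ can be dropped because it does not depend on $t$, which leaves a translation model $P(s \mid t)$ and a language model $P(t)$ that can be estimated separately.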