2021
DOI: 10.1145/3469721
Low Resource Neural Machine Translation: Assamese to/from Other Indo-Aryan (Indic) Languages

Abstract: Machine translation (MT) systems have been built using numerous techniques for bridging language barriers. These techniques are broadly categorized into approaches such as Statistical Machine Translation (SMT) and Neural Machine Translation (NMT). End-to-end NMT systems significantly outperform SMT in translation quality on many language pairs, especially those with adequate parallel corpora. We report comparative experiments on baseline MT systems for Assamese to other Indo-Aryan languages (in b…

Cited by 9 publications (3 citation statements)
References 64 publications
“…However, with the standard definition, interpretation can be completed in 50,000 steps, as shown in Figure 4. Because parameter initialization influences machine translation model training, the parameters of a large-scale Chinese-English translation model belonging to the same translation task are introduced into the initialization of the low-resource Chinese-English and Tibetan-Chinese translation models, so that the models have a parameter basis before training and their learning rate improves during retraining [24,25]. In this work, the encoder and decoder parameters of the translation model are initialized with the parameters of the Chinese encoder of the Chinese-English model and the decoder of the English-Chinese model, respectively.…”
Section: Experiments and Analysis
confidence: 99%
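The parent-to-child initialization the quoted statement describes (reusing a high-resource parent model's encoder and decoder to warm-start a low-resource child) can be sketched as follows. This is a minimal illustration, not the cited paper's code: the dict-of-arrays model representation and the `encoder.`/`decoder.` parameter naming are assumptions.

```python
import numpy as np

def init_child_from_parents(child, zh_en_parent, en_zh_parent):
    # Copy encoder weights from the Chinese->English parent and decoder
    # weights from the English->Chinese parent into the child model.
    # Only parameters whose names and shapes match are transferred;
    # everything else keeps its (e.g. random) initialization.
    for name, param in zh_en_parent.items():
        if name.startswith("encoder.") and name in child and child[name].shape == param.shape:
            child[name] = param.copy()
    for name, param in en_zh_parent.items():
        if name.startswith("decoder.") and name in child and child[name].shape == param.shape:
            child[name] = param.copy()
    return child

# Toy usage: 2x2 weight matrices stand in for real layers.
rng = np.random.default_rng(0)
child = {"encoder.w": np.zeros((2, 2)), "decoder.w": np.zeros((2, 2))}
zh_en = {"encoder.w": rng.normal(size=(2, 2)), "decoder.w": rng.normal(size=(2, 2))}
en_zh = {"encoder.w": rng.normal(size=(2, 2)), "decoder.w": rng.normal(size=(2, 2))}
child = init_child_from_parents(child, zh_en, en_zh)
```

After this warm start, the child model is fine-tuned on the low-resource pair; the transferred parameters give gradient descent a better starting point than random initialization.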
“…Assamese exhibits a subject-object-verb (SOV) word order, in contrast to the subject-verb-object (SVO) word order found in English. Additionally, it is characterized as an agglutinative language, as discussed by Sarma et al. (2017) and Baruah et al. (2021), signifying its propensity to incorporate suffixes and prefixes into words to convey diverse grammatical meanings. This intricacy poses a notable challenge for machine translation systems, as they must accurately analyze and generate these intricate word forms.…”
Section: Assamese
confidence: 99%
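A standard way NMT systems cope with such agglutinative word forms is subword segmentation, e.g. byte-pair encoding (BPE), which lets inflected variants share a common stem token. A minimal BPE-learning sketch follows; the romanized Assamese-like forms (`ghar`, `gharor`, `gharot`) are illustrative examples chosen here, not data from the paper.

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    # word_freqs: dict mapping word -> corpus frequency.
    # Words are represented as tuples of symbols, with an
    # end-of-word marker so suffixes stay distinguishable.
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for sym, f in vocab.items():
            for a, b in zip(sym, sym[1:]):
                pairs[(a, b)] += f
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every word in the vocabulary.
        new_vocab = {}
        for sym, f in vocab.items():
            out, i = [], 0
            while i < len(sym):
                if i < len(sym) - 1 and (sym[i], sym[i + 1]) == best:
                    out.append(sym[i] + sym[i + 1])
                    i += 2
                else:
                    out.append(sym[i])
                    i += 1
            new_vocab[tuple(out)] = f
        vocab = new_vocab
    return merges, vocab

merges, segmented = learn_bpe({"gharor": 2, "gharot": 2, "ghar": 3}, num_merges=3)
```

After three merges, the shared stem is fused into a single `ghar` token while the case suffixes remain separate symbols, which is exactly what lets a subword model generalize across inflected forms it has rarely seen.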
“…MT researchers have introduced several approaches to overcome this bottleneck, such as data augmentation using back-translation (Sennrich et al., 2016a), multilingual approaches (Singh and Singh, 2022a), semi-supervised approaches (Cheng et al., 2016; Singh and Singh, 2022b), and exploiting cues from multiple modalities (Gain et al., 2021; Meetei et al., 2023). There are also reports of comparative studies of MT systems on low-resource machine translation focusing on Indian languages such as Assamese (Baruah et al., 2021) and Mizo (Devi et al., 2022; Thangkhanhau and Hussain, 2023).…”
Section: Introduction
confidence: 99%
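Back-translation, the first approach the quoted statement cites, augments scarce parallel data by translating target-language monolingual text into the source language with a reverse model, then training on the synthetic pairs. A hedged sketch under simplifying assumptions; the word-by-word lookup standing in for the reverse model and the toy sentences are purely illustrative.

```python
def back_translate(monolingual_tgt, reverse_translate, parallel):
    # For each genuine target-language sentence, generate a synthetic
    # source sentence with the reverse (target->source) model, then
    # append the (synthetic source, genuine target) pairs to the
    # existing parallel corpus.
    synthetic = [(reverse_translate(t), t) for t in monolingual_tgt]
    return parallel + synthetic

# Toy reverse "model": a word-by-word lookup (purely illustrative;
# a real system would use a trained target->source NMT model).
toy_lexicon = {"mor": "my", "ghar": "house"}

def reverse_translate(sentence):
    return " ".join(toy_lexicon.get(w, w) for w in sentence.split())

parallel = [("my house", "mor ghar")]
augmented = back_translate(["mor ghar"], reverse_translate, parallel)
```

The key property is that the target side of every synthetic pair is genuine human text, so the forward model still learns to produce fluent target-language output even when the synthetic source side is noisy.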