2016
DOI: 10.1587/transinf.2016edl8080

A Morpheme-Based Weighting for Chinese-Mongolian Statistical Machine Translation

Abstract: In this paper, a morpheme-based weighting and its integration method are proposed as a smoothing method to alleviate data sparseness in Chinese-Mongolian statistical machine translation (SMT). In addition, we present source-side reordering as a pre-processing model to verify the extensibility of our method. Experimental results show that the morpheme-based weighting can substantially improve translation quality.
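The abstract describes backing off from sparse word-level statistics to morpheme-level statistics. A minimal sketch of one plausible form of such a weighting, assuming the paper interpolates a word-level phrase translation probability with a morpheme-level estimate (the function and table names here are illustrative, not the authors' actual implementation):

```python
def morpheme_backoff_prob(phrase_pair, word_table, morpheme_table,
                          segment, lam=0.7):
    """Interpolate a word-level phrase translation probability with a
    morpheme-level estimate, so rare or unseen word forms can back off
    to probabilities of their (more frequent) morphemes."""
    src, tgt = phrase_pair
    p_word = word_table.get((src, tgt), 0.0)
    # Segment each side into morphemes and look up the morpheme sequence.
    src_m = " ".join(m for w in src.split() for m in segment(w))
    tgt_m = " ".join(m for w in tgt.split() for m in segment(w))
    p_morph = morpheme_table.get((src_m, tgt_m), 0.0)
    # Linear interpolation; lam controls trust in the word-level estimate.
    return lam * p_word + (1.0 - lam) * p_morph
```

Because Mongolian morphemes are far fewer than surface word forms, the morpheme-level table is denser, which is what makes this act as smoothing.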

Cited by 3 publications (4 citation statements)
References 8 publications
“…This paper approaches the problem from two aspects, the reranking instance extraction algorithm and feature selection, aiming to solve the imbalance of the maximum entropy training data. In the experiments, the statistical machine translation system [13] based on the maximum entropy reordering model is used as the baseline system.…”
Section: Statistical Machine Translation Based on the Maximum Entropy Phrase Reordering Model
confidence: 99%
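The maximum entropy reordering model referenced above scores phrase orientations (e.g. monotone vs. swap) with a log-linear model over context features. A minimal sketch of that scoring step, with hypothetical feature and weight names (the actual feature set in the cited system is not shown here):

```python
import math

def maxent_orientation_prob(features, weights):
    """P(orientation | context) under a log-linear (maximum entropy) model:
    score each orientation by summing the weights of its active features,
    then normalise the exponentiated scores (softmax)."""
    scores = {orient: sum(weights.get(f, 0.0) for f in feats)
              for orient, feats in features.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {o: math.exp(s) / z for o, s in scores.items()}
```

In a real system the weights are trained (e.g. by GIS or L-BFGS) on orientation instances extracted from word-aligned data; here they are just given.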
“…In Table 3, Dir denotes a word posterior probability feature based on a fixed position, Win denotes a word posterior probability feature based on a sliding window (window size t = 2), and Lev denotes a word posterior probability feature based on Levenshtein alignment. When aligning the 1-best translation hypothesis in the N-best list with the other translation hypotheses, the open-source toolkit TER [13] is used with its "shift" function turned off, which yields WER alignment. The abovementioned three posterior probabilities are discretized before use [10].…”
Section: Classification Experiments Based on Word Posterior
confidence: 99%
“…The morphemes in a word indicate the basic word features and provide grammatical and semantic relations among words in the sentence. Mongolian morphological segmentation aims to split Mongolian words into their morphemes, which facilitates Mongolian NLP tasks such as named entity recognition (Wang et al., 2016), information retrieval (Liu et al., 2012), machine translation (Fan et al., 2017; Yang et al., 2016), and speech synthesis (Liu et al., 2017). There are about 60 thousand morphemes in Mongolian, and they form more than 7 million words.…”
Section: Introduction
confidence: 99%
“…Many researchers have investigated the use of Mongolian morphological information to address data sparsity. Yang et al. [8] used morphology in the factored translation model for Chinese-Mongolian SMT. However, the factored model is computationally expensive, and its translation quality is easily affected by the generation model.…”
Section: Introduction
confidence: 99%