Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions - ACL '07 2007
DOI: 10.3115/1557769.1557821
Moses: Open Source Toolkit for Statistical Machine Translation

Abstract: We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c) efficient data formats for translation models and language models. In addition to the SMT decoder, the toolkit also includes a wide variety of tools for training, tuning and applying the system to many translation tasks.
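To picture the "linguistically motivated factors" mentioned in (a), the sketch below builds Moses-style factored tokens, where each surface word carries extra factors (here a lemma and a POS tag) joined with the '|' separator used in factored training data. The helper name, factor choice, and example sentence are illustrative assumptions, not part of the toolkit's API.

```python
# Minimal sketch of Moses-style factored tokens (surface|lemma|POS).
# Factor choice and example data are illustrative assumptions; Moses itself
# reads such factored text from preprocessed training corpora.

def to_factored(tokens):
    """Join the per-word factors with '|' as in Moses factored training data."""
    return " ".join("|".join(factors) for factors in tokens)

sentence = [
    ("houses", "house", "NNS"),
    ("are", "be", "VBP"),
    ("expensive", "expensive", "JJ"),
]

print(to_factored(sentence))
# houses|house|NNS are|be|VBP expensive|expensive|JJ
```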

Cited by 2,296 publications (221 citation statements). References 9 publications.
“…The pairs were aligned using GIZA++ and the phrase extractor and scorer from the Moses machine translation package (Koehn et al., 2007). To apply a machine translation analogy, we treated words as sentences and the letters from which they were constructed as tokens.…”
Section: Arabizi to Arabic
confidence: 99%
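As a rough illustration of the analogy in that snippet (words treated as sentences, letters as tokens), the hedged sketch below formats transliteration pairs as space-separated character sequences, the form a GIZA++/Moses phrase-based pipeline would consume. The word pairs and output file names are hypothetical placeholders.

```python
# Hedged sketch: prepare word pairs for character-level phrase alignment,
# treating each word as a "sentence" and each letter as a "token".
# The pair list and output paths are hypothetical placeholders.

pairs = [("3arabi", "عربي"), ("salam", "سلام")]

with open("train.arabizi", "w", encoding="utf-8") as src, \
     open("train.arabic", "w", encoding="utf-8") as tgt:
    for arabizi, arabic in pairs:
        src.write(" ".join(arabizi) + "\n")   # "3 a r a b i"
        tgt.write(" ".join(arabic) + "\n")    # "ع ر ب ي"
```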
“…We use mteval from the Moses toolkit (Koehn et al., 2007) and TERCom to evaluate our systems on the BLEU (Papineni et al., 2002) and TER (Snover et al., 2006) measures. Additionally, we use BEER (Stanojević and Sima'an, 2014) and CTER (Wang et al., 2016).…”
Section: SMT Systems
confidence: 99%
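For readers reproducing this kind of evaluation without the Moses mteval script or TERCom, a hedged stand-in is the sacrebleu package, which computes corpus-level BLEU from plain detokenized text; the hypothesis and reference strings below are invented examples, and TER/BEER/CTER are not covered by this sketch.

```python
# Hedged sketch: corpus-level BLEU with sacrebleu as a stand-in for the
# Moses mteval script.  Hypotheses and references are toy examples.
import sacrebleu

hypotheses = ["the cat sat on the mat", "he read the book"]
references = [["the cat is on the mat", "he was reading a book"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```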
“…We used beam search with a beam width of 8 to approximately find the most likely translations given a source sentence, before introducing features proposed by our language models and reranking with the default Moses (Koehn et al., 2007) implementation of K-best MIRA (Cherry and Foster, 2012). Both language models were trained on the English news data.…”
Section: Neural Baseline
confidence: 99%
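The reranking step in that snippet can be pictured with the hedged sketch below: it rescores an n-best list by combining per-hypothesis feature scores with a fixed weight vector and keeps the best candidate. The feature names, weights, and scores are invented for illustration; in the cited setup the weights would be tuned with Moses' K-best MIRA, which is not reproduced here.

```python
# Hedged sketch of n-best reranking: combine per-hypothesis feature scores
# with fixed weights and keep the highest-scoring candidate.
# Features, weights, and scores are invented for illustration.

weights = {"decoder": 1.0, "lm": 0.5}

nbest = [
    {"text": "the house is big",   "decoder": -2.1, "lm": -4.0},
    {"text": "the house is large", "decoder": -2.3, "lm": -3.2},
]

def rerank(hypotheses, weights):
    return max(hypotheses,
               key=lambda h: sum(weights[f] * h[f] for f in weights))

print(rerank(nbest, weights)["text"])
```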
“…Additionally, we backtranslated a subset of these sentences and used the resulting source-target sentences to augment our training data. Our training and development data were lowercased and preprocessed using the Moses tokenizer script (Koehn et al., 2007), Jieba, and BPE. We set the upper bound on the target vocabulary to 30,000 sub-words, with two additional tokens reserved for EOS and UNK.…”
Section: Corpora and Preprocessing
confidence: 99%
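A hedged sketch of that preprocessing follows, using the sacremoses Python port of the Moses tokenizer for lowercasing and tokenization and a simple frequency cutoff with reserved EOS/UNK entries standing in for the BPE vocabulary cap. The toy corpus and the way the 30,000-entry limit is applied are illustrative assumptions; Jieba segmentation and actual BPE learning are omitted.

```python
# Hedged preprocessing sketch: lowercase, tokenize with the sacremoses port
# of the Moses tokenizer, and build a capped vocabulary with reserved
# EOS/UNK entries.  Corpus and cap handling are illustrative assumptions.
from collections import Counter
from sacremoses import MosesTokenizer

tokenizer = MosesTokenizer(lang="en")
corpus = ["The quick brown fox.", "The lazy dog sleeps."]

tokenized = [tokenizer.tokenize(line.lower()) for line in corpus]

counts = Counter(tok for sent in tokenized for tok in sent)
VOCAB_SIZE = 30_000  # upper bound from the cited setup
vocab = ["<eos>", "<unk>"] + [w for w, _ in counts.most_common(VOCAB_SIZE)]

print(tokenized[0])
print(vocab[:10])
```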