Proceedings of the Ninth Workshop on Statistical Machine Translation 2014
DOI: 10.3115/v1/w14-3319

Abu-MaTran at WMT 2014 Translation Task: Two-step Data Selection and RBMT-Style Synthetic Rules

Abstract: This paper presents the machine translation systems submitted by the Abu-MaTran project to the WMT 2014 translation task. The language pair concerned is English-French with a focus on French as the target language. The French to English translation direction is also considered, based on the word alignment computed in the other direction. Large language and translation models are built using all the datasets provided by the shared task organisers, as well as the monolingual data from LDC. To build the translati…

Cited by 3 publications (3 citation statements)
References 9 publications
“…Dublin City University / Lingo24 (Wu et al, 2014) EU-BRIDGE EU-BRIDGE Project (Freitag et al, 2014) KIT Karlsruhe Institute of Technology (Herrmann et al, 2014) IIT-BOMBAY IIT Bombay (Dungarwal et al, 2014) IIIT-HYDERABAD (Rubino et al, 2014) PROMT-RULE, PROMT-HYBRID PROMT RWTH RWTH Aachen (Peitz et al, 2014) STANFORD Stanford University (Neidert et al, 2014; Green et al, 2014) UA-* University of Alicante (Sánchez-Cartagena et al, 2014) UEDIN-PHRASE, UEDIN-UNCNSTR University of Edinburgh (Durrani et al, 2014b) UEDIN-SYNTAX University of Edinburgh (Williams et al, 2014) UU, UU-DOCENT Uppsala University (Hardmeier et al, 2014) YANDEX Yandex School of Data Analysis (Borisov and Galinskaya, 2014…”
Section: Dcu-lingo24 (mentioning)
confidence: 99%
“…University of Stuttgart / University of Munich (Quernheim and Cap, 2014) (Do et al, 2014) MANAWI-* Universität des Saarlandes (Tan and Pal, 2014) MATRAN Abu-MaTran Project: Prompsit / DCU / UA (Rubino et al, 2014) PROMT-RULE, PROMT-HYBRID PROMT RWTH RWTH Aachen STANFORD Stanford University (Neidert et al, 2014; Green et al, 2014) UA-* University of Alicante UEDIN-PHRASE, UEDIN-UNCNSTR University of Edinburgh (Durrani et al, 2014b) UEDIN-SYNTAX University of Edinburgh UU, UU-DOCENT Uppsala University (Hardmeier et al, 2014) Y-SDA Yandex School of Data Analysis (Borisov and Galinskaya, 2014) COMMERCIAL-[1,2] Two commercial machine translation systems ONLINE-[A,B,C,G] Four online statistical machine translation systems 4] Two rule-based statistical machine translation systems Table 2: Participants in the shared translation task. Not all teams participated in all language pairs.…”
Section: Ims-ttt (mentioning)
confidence: 99%
“…Since we will build MT systems for both directions (Croatian to English and English to Croatian), we need monolingual corpora to train LMs for both target languages. For the Croatian-to-English direction, we used the data provided for the WMT14 translation task (Bojar et al 2014), as described in our system submission to that shared task (Rubino et al 2014). For the English-to-Croatian direction, we used the target side of the general-domain parallel corpora (hrenWaC, SETimes and TED Talks) and hrWaC 2.0 (Ljubešić and Klubička 2014), a monolingual Croatian corpus crawled from the .hr top-level domain following the procedure described in Sect.…”
Section: Data Sets (mentioning)
confidence: 99%