2014
DOI: 10.5715/jnlp.21.485

Unlabeled Dependency Parsing Based Pre-reordering for Chinese-to-Japanese SMT

Abstract: In statistical machine translation, Chinese and Japanese form a well-known long-distance language pair that causes difficulties for word alignment techniques. Pre-reordering methods have proven efficient and effective; however, they need reliable parsers to extract the syntactic structure of the source sentences. On one hand, we propose a framework in which only part-of-speech (POS) tags and unlabeled dependency parse trees are used to minimize the influence of parse errors, and linguistic knowledge on struc…

Cited by 6 publications (9 citation statements)
References 26 publications (46 reference statements)

“…An example of such a gap is the alignment between the Chinese words "去(go to) 了(-ed)" and the Japanese words "行っ(go) た(-ed)". In the training stage, these gaps in the alignment matrix, caused by long-distance order differences, may lead IBM Model 4 to miss such word correspondences. Furthermore, the wrong alignments will cause problems during phrase extraction.…”
Section: The Word Alignment Problem
confidence: 99%
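As a rough illustration of this alignment problem (the toy sentence pair and gold links below are invented for exposition, not taken from the cited papers), the following Python sketch measures how far each correct link falls from the diagonal of the alignment matrix; the verb and aspect-particle links land well off the diagonal, which is exactly the kind of long-distance jump a distortion-based model such as IBM Model 4 penalizes:

# Toy illustration (hypothetical data): long-distance order differences between
# Chinese (SVO) and Japanese (SOV) push correct alignment links far from the
# diagonal of the alignment matrix.
zh = ["我", "去", "了", "学校"]                 # "I go-to -ed school"
ja = ["私", "は", "学校", "に", "行っ", "た"]    # "I TOP school to go -ed"

# Hypothetical gold alignment links: (Chinese index, Japanese index)
gold_links = [(0, 0), (3, 2), (1, 4), (2, 5)]

for i, j in gold_links:
    # Mapping position i onto the Japanese length gives the "expected" diagonal slot.
    expected_j = i * (len(ja) - 1) / (len(zh) - 1)
    print(f"{zh[i]} -> {ja[j]}: off-diagonal distance = {abs(j - expected_j):.1f}")

Links that land far off the diagonal are the ones a distortion-penalizing aligner is most likely to miss.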
“…An effective technique for translating sentences between distant language pairs is pre-reordering, where words in sentences from the source language are rearranged with the objective of resembling the word order of the target language. Rearranging rules are either automatically extracted (Xia and McCord, 2004; Genzel, 2010) or linguistically motivated (Xu et al., 2009; Isozaki et al., 2010; Han et al., 2012; Han et al., 2013). We follow the latter strategy, where the source sentence is parsed to find its syntactic structure, and linguistically motivated rules are used in combination with the structure of the sentence to guide the word reordering.…”
Section: Effects of Parsing Errors on Pre-reordering Performance
confidence: 99%
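As a minimal sketch of the general idea (the rule, POS tags, and parse below are hypothetical stand-ins, not the rule set used in the cited work), a linguistically motivated pre-reordering step can be a simple permutation applied over an unlabeled dependency parse, e.g. moving a Chinese verb after its object so the source sentence becomes verb-final like Japanese:

# A toy pre-reordering rule (hypothetical): given an unlabeled dependency parse of a
# Chinese (SVO) sentence, move the verb after its object to mimic Japanese (SOV) order.

def preorder_svo_to_sov(tokens, pos_tags, heads):
    """tokens[i] depends on tokens[heads[i]]; head == -1 marks the root."""
    order = list(range(len(tokens)))
    for obj, head in enumerate(heads):
        # Hypothetical heuristic: an NN whose head is a VV to its left.
        if head != -1 and pos_tags[obj] == "NN" and pos_tags[head] == "VV" and head < obj:
            order.remove(head)
            order.insert(order.index(obj) + 1, head)  # place the verb right after its object
    return [tokens[i] for i in order]

# 我/PN 去/VV 学校/NN ("I go-to school"); 去 is the root, 我 and 学校 attach to 去.
print(preorder_svo_to_sov(["我", "去", "学校"], ["PN", "VV", "NN"], [1, -1, 1]))
# -> ['我', '学校', '去']  (now verb-final, like Japanese)

The point of the sketch is that only the unlabeled tree structure and coarse POS tags drive the permutation; no labeled dependency relations are required.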
“…Since local reordering models integrated in phrase-based SMT systems do not perform well for distant language pairs due to their different syntactic structures, pre-reordering methods have been proposed to improve word alignment. Han et al. (2013) described one of the latest pre-reordering methods (DPC), which is based on dependency parsing. The authors used an unlabeled dependency parser to extract the syntactic information of Chinese sentences and, by combining it with part-of-speech (POS) tags, defined a set of heuristic reordering rules to guide the reordering.…”
Section: Reordering Model
confidence: 99%
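To make the shape of such a framework concrete (the rule below and the data types are illustrative assumptions, not the actual DPC rule set of Han et al. (2013)), the sketch shows how heuristic rules defined over POS tags and unlabeled dependency arcs can be chained into a single pre-reordering pass whose output is then handed to word alignment:

# Structure-only sketch (hypothetical rules and parser output): apply a list of
# heuristic reordering rules over POS tags and unlabeled dependency arcs, then
# feed the reordered source side to word alignment / phrase extraction.
from typing import Callable, List, Tuple

Token = Tuple[str, str, int]               # (word, POS tag, head index; -1 = root)
Rule = Callable[[List[Token]], List[int]]  # returns a permutation of token indices

def apply_rules(parse: List[Token], rules: List[Rule]) -> List[str]:
    order = list(range(len(parse)))
    for rule in rules:
        permutation = rule([parse[i] for i in order])
        order = [order[i] for i in permutation]
    return [parse[i][0] for i in order]

# Hypothetical rule: move any aspect particle (POS "AS", e.g. 了) to the clause end,
# mimicking the sentence-final position of Japanese た.  This toy rule only needs the
# POS column; realistic rules would also consult the head indices.
def aspect_particle_final(parse: List[Token]) -> List[int]:
    keep = [i for i, (_, pos, _) in enumerate(parse) if pos != "AS"]
    move = [i for i, (_, pos, _) in enumerate(parse) if pos == "AS"]
    return keep + move

parse = [("我", "PN", 1), ("去", "VV", -1), ("了", "AS", 1), ("学校", "NN", 1)]
print(apply_rules(parse, [aspect_particle_final]))  # -> ['我', '去', '学校', '了']

In a full pipeline, the reordered source side replaces the original one both for alignment training and as a pre-processing step at decoding time.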
“…Error propagation is a common problem for many NLP tasks (Song et al., 2012; Quirk and Corston-Oliver, 2006; Han et al., 2013; Gildea and Palmer, 2002; Yang and Cardie, 2013). It can occur when NLP tools applied early in a pipeline make mistakes that have a negative impact on higher-level tasks further down the pipeline.…”
Section: Introduction
confidence: 99%