A matching technique in Example-Based Machine Translation

Cranias, Lambros; Papageorgiou, Harris; Piperidis, Stelios

doi:10.3115/991886.991901

Cited by 29 publications

(19 citation statements)

References 12 publications

“…One obvious way in which we could enhance this implementation would be to use an N-gram index as proposed by Nagao and Mori (1994). Dynamic Programming (DP) techniques would undoubtedly lead to greater efficiency, as suggested by Cranias et al (1995, 1997) and also Planas and Furuse (this volume).…”

Section: Retrieval Speed Optimisation (mentioning)

confidence: 99%

The effects of word order and segmentation on translation retrieval performance

Baldwin

Hozumi

2000

Proceedings of the 18th Conference on Computational Linguistics -

This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both bag-of-words and word order-sensitive similarity metrics, and test each over character-based and word-based indexing. The translation retrieval performance of each system configuration is evaluated empirically through the notion of word edit distance between translation candidate outputs and the model translation. Our results indicate that character-based indexing is consistently superior to word-based indexing, suggesting that segmentation is an unnecessary luxury in the given domain. Word order-sensitive approaches are demonstrated to generally outperform bag-of-words methods, with source language segment-level edit distance proving the most effective similarity metric.
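
The abstract's contrast between word-based and character-based indexing under an edit-distance metric can be illustrated with a small sketch. This is not the authors' implementation: the function names, the unit edit costs, and the toy translation memory (with placeholder targets `T1` and `T2`) are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, with unit costs (assumed)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def word_tokens(text):
    """Word-based indexing: requires segmented (space-delimited) input."""
    return text.split()


def char_tokens(text):
    """Character-based indexing: no segmentation step needed."""
    return list(text)


def retrieve(query, memory, tokenize):
    """Return the (source, target) pair whose source is closest to the query."""
    q = tokenize(query)
    return min(memory, key=lambda pair: edit_distance(q, tokenize(pair[0])))


# Hypothetical translation memory; targets are placeholders, not real data.
memory = [
    ("print the document", "T1"),
    ("save the document to disk", "T2"),
]
best_by_word = retrieve("print this document", memory, word_tokens)
best_by_char = retrieve("print this document", memory, char_tokens)
```

The same `edit_distance` routine serves both configurations; only the tokenizer changes, which is why character-based indexing can skip segmentation altogether.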

“…For example, given the constituent matchings depicted as solid lines in Figure 4, the dotted-line matchings corresponding to potential lexical translations would be ruled illegal. Crossing constraints are implicit in many phrasal matching approaches, both constituency-oriented (Kaji, Kida, & Morimoto 1992; Cranias, Papageorgiou, & Piperidis 1994; Grishman 1994) and dependency-oriented (Sadler & Vendelmans 1990; Matsumoto, Ishimoto, & Utsuro 1993). The theoretical cross-linguistic hypothesis here is that the core arguments of frames tend to stay together over different languages.…”

Section: Crossing Constraints (mentioning)

confidence: 99%
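
One way to read the crossing constraint in the statement above is as a filter on candidate word links: a link is ruled illegal if exactly one of its endpoints falls inside a matched constituent pair. The sketch below assumes that reading. The half-open span representation and the names `crosses` and `legal_links` are hypothetical and are not taken from any of the cited papers.

```python
def crosses(link, constituent_pair):
    """True if a word link has exactly one endpoint inside a matched constituent pair.

    link: (i, j), source word index i linked to target word index j.
    constituent_pair: ((s_start, s_end), (t_start, t_end)), half-open spans (assumed).
    """
    i, j = link
    (s_start, s_end), (t_start, t_end) = constituent_pair
    inside_src = s_start <= i < s_end
    inside_tgt = t_start <= j < t_end
    return inside_src != inside_tgt


def legal_links(links, constituent_pairs):
    """Keep only the links that cross none of the constituent matchings."""
    return [
        link for link in links
        if not any(crosses(link, pair) for pair in constituent_pairs)
    ]


# Source words 0-1 are matched to target words 1-2 as one constituent.
pairs = [((0, 2), (1, 3))]
candidates = [(0, 2), (1, 0), (3, 0)]
kept = legal_links(candidates, pairs)
```

Here the link `(1, 0)` is dropped because its source word lies inside the matched constituent while its target word lies outside it. Links wholly inside or wholly outside the pair are kept.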

“…Subsequently, the constituents of each sentence-pair are matched according to some heuristic procedure. A number of recent proposals can be cast in this framework (Sadler & Vendelmans 1990; Kaji, Kida, & Morimoto 1992; Matsumoto, Ishimoto, & Utsuro 1993; Cranias, Papageorgiou, & Piperidis 1994; Grishman 1994).…”

Section: Phrasal Alignment (mentioning)

confidence: 99%

Bracketing and aligning words and constituents in parallel text using Stochastic Inversion Transduction Grammars

2000

Text, Speech and Language Technology

Abstract: We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing with a variety of parallel corpus analysis applications. Aside from the bilingual orientation, three major features distinguish the formalism from the finite-state transducers more traditionally found in computational linguistics: it skips directly to a context-free rather than finite-state base, it permits a minimal extra degree of ordering flexibility, and its probabilistic formulation admits an efficient maximum-likelihood bilingual parsing algorithm. A convenient normal form is shown to exist. Analysis of the formalism's expressiveness suggests that it is particularly well-suited to model ordering shifts between languages, balancing needed flexibility against complexity constraints. We discuss a number of examples of how stochastic inversion transduction grammars bring bilingual constraints to bear upon problematic corpus analysis tasks such as segmentation, bracketing, phrasal alignment, and parsing.

“…Section 2.2)) and translated by combining already translated phrases stored in this lexicon, very much along the lines proposed originally by Becker, and applied by Schaler (1996). The use of such sub-sentential phrasal information enables EBMT systems to be particularly useful for capturing complex translation relations, such as idiomatic expressions, and as Cranias et al (1994) point out, the potential of EBMT relies on this ability to exploit smaller sub-sentential units. Phrases also lend themselves more easily to the matching and subsequent translation process while still minimizing the risk of increasing the level of ambiguity during both stages.…”

Section: Example-based Machine Translation (mentioning)

confidence: 99%