2011
DOI: 10.7202/1006182ar

Dutch Parallel Corpus: A Balanced Copyright-Cleared Parallel Corpus

Abstract: This paper presents the Dutch Parallel Corpus, a high-quality parallel corpus for Dutch, French and English consisting of more than ten million words. The corpus contains five different text types and is balanced with respect to text type and translation direction. All texts included in the corpus have been cleared from copyright. We discuss the importance of parallel corpora in various research domains and contrast the Dutch Parallel Corpus with existing parallel corpora. The Dutch Parallel Corpus distinguish…

Cited by 70 publications (26 citation statements)
References 21 publications

“…Chesterman 2004;Klaudy 2001;Olohan & Baker 2000;Øverås 1998). On the other hand, however, the article and the research that resulted from it were criticized as well (see e.g.…”
Section: Theoretical Background 21 Traditions In Translation Studies (mentioning)
confidence: 97%
“…These manually created reference alignments can be used to develop or test automatic word alignment systems. For more information on the sub-sentential alignments, we refer to [17].…”
Section: Fig 11.1 Alignment Spot Check (mentioning)
confidence: 99%
“…Table 11.1). For a detailed description of the DPC corpus design and text typology, we refer to [17,24].…”
Section: Balanced Corpus Design (mentioning)
confidence: 99%
“…The source sentences in this data set were extracted from three different text types of the Dutch Parallel Corpus (Macken et al, 2011). The translations in this data set were obtained from Google Translate.…”
Section: Detecting Grammatical Errors (mentioning)
confidence: 99%