2013
DOI: 10.1007/s10579-013-9236-1

Dealing with orthographic variation in a tagger-lemmatizer for fourteenth century Dutch charters

Cited by 8 publications (5 citation statements)
References 12 publications
“…Distance measures are used to compare historical variants to modern lexicon entries [20,87,139]. Normalisation systems often combine several of these three approaches [1,12,139,180]. The fourth approach is statistical.…”
Section: NLP Challenges (mentioning)
confidence: 99%
“…Both cga and cgl contain medieval Dutch material from the Gysseling corpus curated by the Institute for Dutch Lexicology. cga is a charter collection (administrative documents), whereas cgl concerns a variety of literary texts that greatly vary in length. crm is another Middle Dutch charter collection from the 14th century with wide geographic coverage (Van Reenen and Mulder, 1993; van Halteren and Rem, 2013). cgr, finally, is a smaller collection of samples from Middle Dutch religious writings that include later medieval texts (Kestemont et al., 2016).…”
Section: Datasets (mentioning)
confidence: 99%
“…1300 AD. Secondly, we use the CRM-ADELHEID collection, a comparable collection of fourteenth century Middle Dutch charters, which has been the subject of a study comparable to ours (Van Halteren and Rem, 2013). As literary materials, we first of all use the literary counterpart of CG-ADMIN in the Corpus-Gysseling, a collection of Middle Dutch literary texts that all survive in manuscript copies predating ca.…”
Section: Data Sets (mentioning)
confidence: 99%