2018
DOI: 10.1007/978-3-030-01204-5_4

Lemmatization for Ancient Languages: Rules or Neural Networks?

Cited by 7 publications (12 citation statements)
References 18 publications
“…Initial evaluation of the POS tagging of the 1882-1926 segment of the corpus pointed to F-scores [33] ranging from 91-96% (Uí Dhonnchadha et al 2014). Dereza (2018), who discusses lemmatisation approaches for ancient and morphologically complex languages, reports that neither rule-based approaches [footnote 32: Code available at: https://github.com/kscanne/caighdean/ [accessed 10 March 2020]. Footnote 33: A measure of a test's accuracy that incorporates "precision" (e.g.…”
Section: Natural Language Processing Methods (mentioning)
confidence: 99%
“…Fransen carefully weighs the advantages of different approaches in order to ensure the applicability of his analyser. He also envisions a fully functioning POS-tagger suitable for both Old and Middle Irish by making some suggestions for allowing interoperability of resources, especially between his morphological analyser and Dereza's (2018) Old Irish lemmatiser.…”
Section: Description Of Part (mentioning)
confidence: 99%
“…Examples of variants are: igreja matris/igreja matriz, parochia/parrochia. This pre-processing task is very challenging (Baron & Rayson, 2008; Dereza, 2018). There are some tools for classic languages, like CLTK: the classical language toolkit, [14] but concerning the classic period of the Portuguese language, existing tools [15] still need to be trained for this period.…”
Section: (42) Applications (mentioning)
confidence: 99%
“…For instance, Eger et al (2016) evaluates different pre-existing models on a dataset of German and Medieval Latin, and Dereza (2018) focuses on Early Irish. The most similar to the present paper in this area is work by Kestemont et al (2016), which tackled lemmatization of Middle Dutch with a neural encoder that extracts character and word-level features from a fixed-length token window and predicts the target lemma from a closed-set of true lemmas.…”
Section: Related Work (mentioning)
confidence: 99%