2020
DOI: 10.14712/00326585.003

Universal Derivations 1.0, A Growing Collection of Harmonised Word-Formation Resources

Abstract: The aim of this paper is to open a discussion on harmonization of existing data resources related to derivational morphology. We present a newly assembled collection of eleven harmonized resources named "Universal Derivations" (clearly being inspired by the success story of the Universal Dependencies initiative in treebanking), as well as the harmonization process that brings the individual resources under a unified annotation scheme.

Cited by 8 publications (8 citation statements)
References: 13 publications
“…UDer (Kyjánek et al, 2020) is a collection of individual monolingual resources of derivational morphology. Most of them have been carefully evaluated against their own datasets and offer high quality.…”
Section: Discussion (mentioning)
confidence: 99%
“…More recently, several projects have followed the approach of formalizing and/or integrating existing morphological data for multiple languages. UDer (Universal Derivations) (Kyjánek et al, 2020) integrates 27 derivational morphology resources in 20 languages. UniMorph (Kirov et al, 2016) and the Wikinflection Corpus (Metheniti and Neumann, 2020) rely mostly on Wiktionary from which they extract inflectional information.…”
Section: State of the Art (mentioning)
confidence: 99%
“…Universal Derivations (UDer, Kyjánek et al, 2019) proposes a unified scheme for derivational morphology. The Turkish part of the project uses EtymWordNet (de Melo and Weikum, 2010) as a resource.…”
Section: Lexical Resources (mentioning)
confidence: 99%
“…Compared to state-of-the-art derivational resources (Vidra et al, 2019;Kyjánek et al, 2019), this dataset provides explicit morphemes between source and target word forms. With these morphemes, subword tokenization (Sennrich et al, 2016;Mielke et al, 2021) can be advanced to dictionary-based morpheme segmentation for derivationally rich languages like English and French.…”
Section: Derivational Morphology (mentioning)
confidence: 99%