Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 1 2017
DOI: 10.18653/v1/e17-1034

Universal Dependencies and Morphology for Hungarian - and on the Price of Universality

Abstract: In this paper, we present how the principles of universal dependencies and morphology have been adapted to Hungarian. We report the most challenging grammatical phenomena and our solutions to those. On the basis of the adapted guidelines, we have converted and manually corrected 1,800 sentences from the Szeged Treebank to universal dependency format. We also introduce experiments on this manually annotated corpus for evaluating automatic conversion and the added value of language-specific, i.e. non-universal, …

Cited by 6 publications (3 citation statements)
References 6 publications
“…Several different notation systems have been developed to provide the mentioned annotation, including UD (Universal Dependencies) [14] which aim is to provide a general-purpose language independent solution. Among others there is a Hungarian corpus available within the UD project [15]. Also, there is a much larger Hungarian text corpus that is following the file format that was defined in the UD project, however the annotation label, the morphological description and the POS tag set are different.…”
Section: Survey On Dependency Parsing (mentioning)
confidence: 99%
“…Formerly in e-magyar, the model behind emDep was trained on POS tags and morphosyntactic features converted by DepTool. Consequently, the model had to be replaced with one trained on Szeged Treebank with UD tags (Vincze et al, 2017).…”
Section: Emmorph2ud (mentioning)
confidence: 99%
“…As a starting point, we chose the Hungarian subcorpus (Vincze et al, 2017) of the Universal Dependencies (UD) corpus (Nivre et al, 2016) consisting of 1800 sentences (42000 tokens) of mainly newswire text in order to put the annotation schema we propose in a context that can be interpreted at an international level. The UD corpus contains texts in many languages annotated with morphosyntactic and dependency-based syntactic analysis using unified principles and categories.…”
Section: The Corpus (mentioning)
confidence: 99%