Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), 2018
DOI: 10.18653/v1/w18-6002

Using Universal Dependencies in cross-linguistic complexity research

Abstract: We evaluate corpus-based measures of linguistic complexity obtained using Universal Dependencies (UD) treebanks. We propose a method of estimating robustness of the complexity values obtained using a given measure and a given treebank. The results indicate that measures of syntactic complexity might be on average less robust than those of morphological complexity. We also estimate the validity of complexity measures by comparing the results for very similar languages and checking for unexpected differences. We…
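As a rough illustration of the robustness idea described in the abstract, one way to gauge how stable a corpus-based measure is for a given treebank is to recompute it on random subsamples and inspect the spread of the resulting values. The sketch below assumes a simple type-token ratio (TTR) as the measure and an arbitrary subsampling scheme; it is not necessarily the exact procedure used in the paper.

```python
# Sketch: estimate how robust a corpus-based complexity measure is for one treebank
# by recomputing it on random subsamples and reporting the spread of the values.
# The subsampling scheme, sample sizes, and the TTR measure are illustrative
# assumptions, not necessarily the procedure used in the paper.
import random
import statistics

def type_token_ratio(tokens):
    """A simple morphological-complexity proxy: distinct word forms / tokens."""
    return len(set(tokens)) / len(tokens)

def robustness(sentences, measure, n_samples=10, sample_size=1000, seed=0):
    """Apply `measure` to n_samples random subsamples of the corpus and
    return the mean value and its standard deviation (lower = more robust)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        sample = rng.sample(sentences, min(sample_size, len(sentences)))
        tokens = [tok for sent in sample for tok in sent]
        values.append(measure(tokens))
    return statistics.mean(values), statistics.stdev(values)

# Usage with a toy "treebank" (a list of tokenised sentences):
corpus = [["the", "dog", "barks"], ["a", "cat", "sleeps"]] * 600
mean, spread = robustness(corpus, type_token_ratio, sample_size=200)
print(f"TTR = {mean:.3f} +/- {spread:.3f}")
```

A small spread across subsamples would suggest the measure is robust for that treebank; per the abstract, syntactic measures appear to be on average less robust in this sense than morphological ones.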

Cited by 19 publications (17 citation statements). References 14 publications (14 reference statements).

Citation statements (ordered by relevance):
“…Third, TTR correlates quite well both with other, more advanced corpus-based measures and with manually compiled grammar-based measures of complexity (Bentz et al., 2016; Kettunen, 2014). Fourth, it performs no worse than average in a recent comparison of several corpus-based complexity measures (Berdicevskis et al., 2018).…”
Section: Measuring Complexity
Confidence: 81%
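Since TTR recurs as a baseline corpus-based measure in these citation statements, a minimal sketch of computing TTR directly from a UD treebank in CoNLL-U format may be useful. Lowercasing word forms and the example file name are assumptions for illustration.

```python
# Sketch: compute a type-token ratio (TTR) over the word forms of a UD treebank
# in CoNLL-U format. Lowercasing and the example file path are assumptions.
def ttr_from_conllu(path):
    types, tokens = set(), 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip comments and blank lines between sentences.
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            # Skip multiword-token (1-2) and empty-node (1.1) rows.
            if "-" in cols[0] or "." in cols[0]:
                continue
            types.add(cols[1].lower())  # column 2 = word FORM
            tokens += 1
    return len(types) / tokens

# Example (hypothetical path):
# print(ttr_from_conllu("en_ewt-ud-train.conllu"))
```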
“…The factors evaluated in the above, from Nivre et al. (2007); Van Asch and Daelemans (2010); McDonald and Nivre (2011); Nivre and Fang (2017); Çöltekin and Rama (2018); Berdicevskis et al. (2018), were already discussed. A few other factors have been pointed out in the literature that were not applicable to our experiments: Søgaard and Haulrich (2010) show that the perplexity of the derivation orders of a transition-based dependency parser is also predictive of parser performance.…”
Section: Related Work
Confidence: 99%
“…POS bigram perplexity: Others have proposed to use the perplexity of a POS bigram language model, trained on the treebank's training section and applied to its test section, to predict parser performance (Çöltekin and Rama, 2018; Berdicevskis et al., 2018).…”
Section: Factors
Confidence: 99%
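A hedged sketch of how such a POS bigram perplexity predictor can be set up: count tag bigrams on the training section and compute perplexity on the test section. Add-one smoothing and the sentence-boundary symbol are simplifying assumptions; the cited papers may make different choices.

```python
# Sketch: perplexity of a POS bigram model trained on one collection of tag
# sequences and evaluated on another. Add-one smoothing and the "<s>" boundary
# symbol are simplifying assumptions.
import math
from collections import Counter

def train_bigram(tag_sentences):
    unigrams, bigrams = Counter(), Counter()
    for tags in tag_sentences:
        padded = ["<s>"] + tags
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def perplexity(tag_sentences, unigrams, bigrams):
    vocab = len(unigrams)
    log_prob, n = 0.0, 0
    for tags in tag_sentences:
        padded = ["<s>"] + tags
        for prev, cur in zip(padded, padded[1:]):
            # Add-one smoothed conditional probability P(cur | prev).
            p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
            log_prob += math.log(p)
            n += 1
    return math.exp(-log_prob / n)

# Toy usage with UPOS tag sequences from a training and a test section:
train = [["DET", "NOUN", "VERB"], ["PRON", "VERB", "ADV"]]
test = [["DET", "NOUN", "VERB", "ADV"]]
uni, bi = train_bigram(train)
print(f"POS bigram perplexity: {perplexity(test, uni, bi):.2f}")
```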
“…DLM is a constraint on the size of the whole flux of a sentence and therefore a particular case of constraints on the complexity of the flux. DLM is neither the only metric for syntactic complexity (see Lewis (1996) for several constituency-based metrics; Berdicevskis et al. 2018), nor the only metric on the complexity of the flux, and perhaps not the best. We will present other potentially interesting flux-based metrics.…”
Section: DLM-related Constraints
Confidence: 99%
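Because DLM (dependency length minimization) is quantified over head-dependent distances, a small sketch of computing mean dependency length from a CoNLL-U sentence may help make the metric concrete. Skipping the root and measuring absolute distance in surface tokens are common conventions, assumed here rather than taken from the cited work.

```python
# Sketch: mean dependency length (absolute distance in tokens between each word
# and its head) for one CoNLL-U sentence. The root word (HEAD = 0) is skipped;
# this convention and the use of surface-token distance are assumptions.
def mean_dependency_length(conllu_sentence):
    lengths = []
    for line in conllu_sentence.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token and empty-node rows
        word_id, head = int(cols[0]), int(cols[6])  # column 7 = HEAD
        if head != 0:
            lengths.append(abs(word_id - head))
    return sum(lengths) / len(lengths)

sentence = """\
1\tthe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_"""
print(mean_dependency_length(sentence))  # (1 + 1) / 2 = 1.0
```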