Proceedings of the Third Arabic Natural Language Processing Workshop 2017
DOI: 10.18653/v1/w17-1320

Universal Dependencies for Arabic

Abstract: We describe the process of creating NUDAR, a Universal Dependency treebank for Arabic. We present the conversion from the Penn Arabic Treebank to the Universal Dependency syntactic representation through an intermediate dependency representation. We discuss the challenges faced in the conversion of the trees, the decisions we made to solve them, and the validation of our conversion. We also present initial parsing results on NUDAR.

Cited by 18 publications (19 citation statements)
References: 10 publications
“…Table 2 illustrates the sample of the Corpora generated by UDPipe model. UDPipe model conducts (CoNLL-U) format files by performing tokenization, morphological analysis, POS tagging, lemmatization, and dependency parsing for nearly Universal Dependencies 2.5 [28], [29]. UDPipe was developed at Charles University in Prague [28], [30].…”
Section: Arabic Dependency Corpora (citation type: mentioning)
confidence: 99%
“…In this work, we do not include clitics as a part of the paradigms, as they heavily increase the size of the paradigms. We made the exception to add the Al determiner particle in order to be consistent with commonly used tokenizations for Arabic treebanks-Penn Arabic Treebank (Maamouri et al, 2004) and Arabic Universal Dependencies (Taji et al, 2017).…”
Section: Semitic: Classical Syriac (citation type: mentioning)
confidence: 99%
“…Marton et al (2013) explored the use several morpho-syntactic features in the easy-first framework, while Shahrour et al (2015;Shahrour et al (2016) used MaltParser (Nivre et al, 2006). Taji et al (2017) presented the UD treebank more recently and conducted experiments on CATiB and UD separately in a single-task settings. Multitask systems that have been developed for Arabic were part of efforts to build one multilingual system for all UD dependencies.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…The first is the Columbia Arabic Treebank (CATiB) representation (Habash and Roth, 2009), which is inspired by Arabic traditional grammar and which focus on modeling syntactic and morpho-syntactic agreement and case assignment. The second is the Universal Dependency (UD) representation (Taji et al, 2017), which has relatively more focus on semantic/thematic relations, and which is coordinated in design with a number of other languages (Nivre et al, 2017). While previous work on Arabic dependency parsing (Marton et al, 2013;Taji et al, 2017) tackled these formalisms separately, we argue that they stand to benefit from multitask learning (MTL) (Caruana, 1993).…”
Section: Introduction (citation type: mentioning)
confidence: 99%