Universal Dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are used as the central device for explaining how predicate–argument structures are encoded morphosyntactically in different languages, while morphological features and part-of-speech classes describe the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.
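As a rough illustration of the three layers named in this abstract (part-of-speech classes, morphological features, and grammatical relations between words), the sketch below encodes one invented English sentence. The `Word` class, the `core_arguments` helper, and the example sentence are hypothetical and not taken from the article; only the tag, feature, and relation labels follow UD naming conventions.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """One syntactic word with its UD-style annotation (hypothetical container)."""
    idx: int       # 1-based position in the sentence
    form: str      # surface form
    upos: str      # universal part-of-speech tag
    head: int      # index of the governing word, 0 for the root
    deprel: str    # grammatical relation to the head
    feats: dict = field(default_factory=dict)  # morphological features


# Invented example sentence: "The dogs chase cats"
sentence = [
    Word(1, "The", "DET", 2, "det", {"Definite": "Def", "PronType": "Art"}),
    Word(2, "dogs", "NOUN", 3, "nsubj", {"Number": "Plur"}),
    Word(3, "chase", "VERB", 0, "root", {"Tense": "Pres", "VerbForm": "Fin"}),
    Word(4, "cats", "NOUN", 3, "obj", {"Number": "Plur"}),
]

# Relations that UD treats as core arguments of a predicate
CORE_RELATIONS = {"nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp"}


def core_arguments(words, predicate_idx):
    """Return (relation, form) pairs for the core arguments of one predicate."""
    return [
        (w.deprel, w.form)
        for w in words
        if w.head == predicate_idx and w.deprel in CORE_RELATIONS
    ]


print(core_arguments(sentence, 3))
```

Reading the predicate–argument structure off the relations in this way is what allows the same query to be applied to treebanks of different languages, provided they share the annotation scheme.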
Despite the current availability of large collections of treebanks that can be used and queried from one common place on the web, we are still far from achieving real interconnection, both among treebanks themselves and with other kinds of linguistic resources. Yet making resources interoperable is a crucial requirement for maximizing the contribution of each single resource, as well as for accounting for the linguistic complexity of the texts provided by annotated corpora, and by treebanks in particular. This paper describes how dependency treebanks are interlinked in a Knowledge Base of linguistic resources for Latin based on Linked Open Data practices and standards. The Knowledge Base is built to make linguistic resources interact by integrating all types of annotation applied to a particular word or text into a common representation.
Assuming that collaboration between theoretical and computational linguistics is essential in projects aimed at developing language resources like annotated corpora, this paper presents the first steps of the semantic annotation of the Index Thomisticus Treebank, a dependency-based treebank of Medieval Latin. The semantic layer of annotation of the treebank is detailed and the theoretical framework supporting the annotation style is explained and motivated.
In Ancient Greek, as in other languages, whenever agreement is triggered by two or more coordinated phrases, two different constructions are allowed: either the agreement is controlled by the coordinated phrase as a whole, or it is triggered by just one of the coordinated words. Despite the amount of information on this topic in grammars of Ancient Greek, much remains unknown even at a general descriptive level. More importantly, the data still lack a convincing explanation. In this paper, we focus on a special domain of agreement (subject and verb agreement) and on one morphological feature that is expected to covary (number). We discuss agreement in number for conjoined phrases by reviewing some of the modern hypotheses with the support of the empirical evidence that can be collected from the available syntactically annotated corpora of Ancient Greek (treebanks). Results are interpreted according to syntactic features, cognitive factors and semantic properties of the coordinated phrases.
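The kind of treebank evidence described in this abstract can be sketched as a small query over dependency annotation. The example below is a minimal sketch under several assumptions that do not come from the paper: it assumes a UD-style CoNLL-U encoding (first conjunct as head, later conjuncts attached with `conj`), it uses an invented English sentence rather than Ancient Greek data, and the function names and the labels `resolved` and `single-conjunct` are hypothetical.

```python
def parse_conllu(text):
    """Parse one CoNLL-U sentence (10 tab-separated columns) into token dicts."""
    tokens = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges (1-2) and empty nodes (1.1)
            continue
        feats = {}
        if cols[5] != "_":
            feats = dict(pair.split("=", 1) for pair in cols[5].split("|"))
        tokens.append({
            "id": int(cols[0]), "form": cols[1], "upos": cols[3],
            "feats": feats, "head": int(cols[6]), "deprel": cols[7],
        })
    return tokens


def classify_agreement(tokens):
    """Label number agreement between a predicate and its coordinated subject."""
    results = []
    for subj in tokens:
        if subj["deprel"] != "nsubj":
            continue
        # UD convention: later conjuncts depend on the first one via `conj`
        conjuncts = [subj] + [
            t for t in tokens if t["head"] == subj["id"] and t["deprel"] == "conj"
        ]
        if len(conjuncts) < 2:
            continue  # subject is not coordinated
        verb = next((t for t in tokens if t["id"] == subj["head"]), None)
        if verb is None:
            continue
        verb_number = verb["feats"].get("Number")
        all_singular = all(c["feats"].get("Number") == "Sing" for c in conjuncts)
        if verb_number is None or not all_singular:
            label = "undetermined"      # cannot tell the two patterns apart
        elif verb_number == "Sing":
            label = "single-conjunct"   # verb agrees with one conjunct only
        else:
            label = "resolved"          # verb agrees with the phrase as a whole
        results.append((verb["form"], label))
    return results


# Invented toy sentence: "Mary and John arrive"
ROWS = [
    ("1", "Mary", "Mary", "PROPN", "_", "Number=Sing", "4", "nsubj", "_", "_"),
    ("2", "and", "and", "CCONJ", "_", "_", "3", "cc", "_", "_"),
    ("3", "John", "John", "PROPN", "_", "Number=Sing", "1", "conj", "_", "_"),
    ("4", "arrive", "arrive", "VERB", "_", "Number=Plur", "0", "root", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

print(classify_agreement(parse_conllu(SAMPLE)))
```

The sketch only separates the two patterns when every conjunct is singular, since that is the case in which the number of the verb alone shows which controller was chosen; a real study would also need to handle the dual, mixed-number conjuncts, and the native formats of the individual treebanks.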