CzeDLex is a new electronic lexicon of Czech discourse connectives, planned for publication by the end of this year. Its data format and structure are based on a study of similar existing resources, and adjusted to comply with the Czech syntactic tradition and specifics and with the Prague approach to the annotation of semantic discourse relations in text. In the article, we first put the lexicon in the context of related resources and discuss theoretical aspects of building the lexicon: we present arguments for our choice of the data structure and for selecting features of the lexicon entries, while special attention is paid to a consistent and (as far as possible) uniform encoding of both primary connectives (such as English because, therefore) and secondary connectives (e.g. for this reason, this is the reason why). The main principle adopted for nesting entries in the lexicon is, apart from the lexical form of the connective, a discourse-semantic type (sense) expressed by the given connective, which enables us to deal with a broad formal variability of connectives and is convenient for interlinking CzeDLex with lexicons in other languages. Second, we introduce the chosen technical solution based on the Prague Markup Language, which allows for an efficient incorporation of the lexicon into the family of Prague treebanks: it can be directly opened and edited in the tree editor TrEd, processed from the command line in btred, interlinked with its source corpus and queried in the PML Tree Query engine. Third, we describe the process of getting data for the lexicon by exploiting a large corpus manually annotated with discourse relations (the Prague Discourse Treebank 2.0); we elaborate on the automatic extraction part, post-extraction checks and manual addition of supplementary linguistic information.
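The nesting principle described in the abstract above (lexical form first, then discourse-semantic type) can be sketched as a small data structure. This is only an illustrative sketch: the class names, field names, sense label and example sentences are hypothetical, and the lexicon itself is encoded in the Prague Markup Language rather than in Python.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Usage:
    """One discourse-semantic type (sense) of a connective, with examples."""
    sense: str                                   # hypothetical label, e.g. "reason-result"
    examples: list[str] = field(default_factory=list)


@dataclass
class Entry:
    """One lexicon entry, keyed by the lexical form of the connective."""
    lemma: str
    kind: str                                    # "primary" or "secondary"
    usages: dict[str, Usage] = field(default_factory=dict)

    def add_example(self, sense: str, text: str) -> None:
        # Nest by sense under the lemma; create the sense on first use.
        usage = self.usages.setdefault(sense, Usage(sense))
        usage.examples.append(text)


def build_lexicon(records):
    """Group (lemma, kind, sense, text) records into nested entries."""
    lexicon: dict[str, Entry] = {}
    for lemma, kind, sense, text in records:
        entry = lexicon.setdefault(lemma, Entry(lemma, kind))
        entry.add_example(sense, text)
    return lexicon


# Invented records standing in for occurrences extracted from a corpus.
records = [
    ("because", "primary", "reason-result", "He stayed home because it rained."),
    ("for this reason", "secondary", "reason-result", "It rained. For this reason, he stayed home."),
    ("because", "primary", "reason-result", "She left because she was tired."),
]
lexicon = build_lexicon(records)
```

Because primary and secondary connectives share the same entry shape here, both can be looked up by sense in the same way, which is the property the abstract highlights for interlinking with other lexicons.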
As the quality of machine translation rises and neural machine translation (NMT) moves from sentence-level to document-level translation, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing discourse phenomena of MT systems participating in the News Translation Task. We have manually checked the outputs and identified types of translation errors that are relevant to document-level translation.
Describing implicit phenomena in discourse is known to be a problematic task, from both theoretical and empirical perspectives. The present article contributes to this topic with a novel comparative analysis of two prominent annotation approaches to discourse relations (coherence relations) that were carried out on the same texts. We compare the annotation of implicit relations in the Penn Discourse Treebank 2.0, i.e. discourse relations not signaled by an explicit discourse connective, to the recently released analysis of signals of rhetorical relations in the RST Signalling Corpus (RST-SC). The intersection of corresponding pairs of relations is rather small, but it shows a clear tendency: unlike the overall signal distribution in the RST-SC, more than half of the signals in the studied intersection are of the semantic type, formed mostly by loosely defined lexical chains. Our data transformation allows for a simultaneous depiction and detailed study of the two resources.
Cvrčková H., Máchová P., Poláková L., Trčková O. (2017): Evaluation of the genetic diversity of selected Fagus sylvatica L. populations in the Czech Republic using nuclear microsatellites. J. For. Sci., 63: 53-61.
Fagus sylvatica Linnaeus (European beech), the ecologically and economically most important broadleaved tree species in the Czech Republic, was strongly reduced in the past. Today there are efforts to increase the proportion of beech to ensure optimal forest tree species composition. When extensively reintroducing beech, it is important to acquire more detailed knowledge of genetic diversity. Thirteen important beech populations in different stands in the territory of the Czech Republic were genotyped using 12 polymorphic nuclear microsatellite markers. The genotypic data from adult trees imply genetic differences between the populations. The estimated genetic diversity expressed as Shannon's information index ranged from 1.73 to 1.92. All thirteen beech populations showed an excess of homozygotes, as indicated by positive fixation index (F) values (F = 0.005-0.115). The pairwise F_ST values indicated low genetic differentiation between the 13 Czech beech populations; because they were greater than zero, they confirmed the presence of population structuring in Czech European beech. No significant linear correlations were observed between genetic and geographic distances of the 13 beech populations studied on the basis of microsatellite markers. Twelve microsatellite markers were verified as highly polymorphic and suitable for genotyping of European beech populations.
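The two per-population statistics quoted in the abstract above can be illustrated with their standard textbook definitions for a single locus: Shannon's information index I = -Σ p_i ln p_i over allele frequencies p_i, and the fixation index F = 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. The sketch below is a minimal illustration under those assumptions; the function name and the toy genotypes are invented, and the abstract does not say which software or exact estimators the authors used.

```python
import math
from collections import Counter


def locus_stats(genotypes):
    """Return (shannon_index, h_obs, h_exp, fixation_index) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid tree.
    """
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    freqs = [count / total_alleles for count in allele_counts.values()]

    # Shannon's information index over allele frequencies (natural log).
    shannon = -sum(p * math.log(p) for p in freqs)

    # Expected heterozygosity under Hardy-Weinberg proportions.
    h_exp = 1.0 - sum(p * p for p in freqs)

    # Observed heterozygosity: share of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Positive F means fewer heterozygotes than expected (homozygote excess).
    fixation = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return shannon, h_obs, h_exp, fixation


# Toy data (invented): four trees scored at one microsatellite locus,
# alleles given as fragment lengths.
toy_genotypes = [(100, 100), (100, 102), (102, 102), (100, 100)]
shannon, h_obs, h_exp, fixation = locus_stats(toy_genotypes)
```

In a multi-locus study such as this one, per-locus values like these would typically be averaged across the markers for each population; that averaging step is omitted here.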