This paper presents the ParlaMint corpora, which contain transcriptions of the sessions of 17 European national parliaments and total half a billion words. The corpora are uniformly encoded, contain rich metadata about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available for download via the CLARIN.SI repository, as well as for online exploration and analysis through the NoSketch Engine and KonText concordancers and the Parlameter interface.
This chapter analyzes the use of attributive
adjectives as nominal pre-modifiers in two corpora: CorAChem (Corpus
of Articles in Chemistry) and CorAAL (Corpus of Articles in Applied
Linguistics) composed of 150 articles each. Our aim is to understand
the use of pre-modifying adjectives in noun phrases (NPs),
considering which adjectives are used as well as the NP size and
frequency across disciplines. The results show that both corpora
carry more classifiers than descriptors. Nevertheless, each
discipline favors the use of adjectives with specific functions.
Both corpora use long pre-modifying sequences, but CorAChem carries a
greater number of longer sequences. The use of long NPs may affect
understanding because of the interrelations among the NP constituents,
as shown in CorAChem.