We present a parallel corpus study on the expression of the temporal construction ‘not…until’ in a sample of European languages. We use data from the Europarl corpus and create semantic maps by multidimensional scaling in order to analyze cross-linguistic and language-internal variation. This paper builds on formal semantic and typological work, extending it by including conditional constructions as well as connectives of the type ‘as long as’. In an investigation of 7 languages, we find that (i) languages use many more different constructions to convey this meaning than was expected from the literature; and (ii) the combination of polarity marking (negation/assertion) strongly correlates with the type of connective. We corroborate our results in a larger sample of 21 European languages. An analysis of clusters and dimensions of the semantic maps based on the enlarged dataset shows that connectives are not randomly distributed across the semantic space of the ‘not…until’ domain.
This paper provides an in-depth analysis of the semantic and syntactic properties of all-clefts (e.g. ‘All I ate for dinner was a salad’). The main characteristic of all-clefts is the inference that what is designated by the cleft is not much (the “smallness effect”). On the basis of novel observations on all-clefts with multi-clausal precopular clauses, and on their interaction with negation and questions, I argue for three claims: (i) the word ‘all’ is the head of a relative clause (not a free relative), (ii) the precopular clause is derived by syntactic movement, and (iii) the source of the smallness effect is the mirativity of ‘only’ (Beaver & Clark 2008; Zeevat 2009). The little formal work that exists on all-clefts (Homer 2019) does not offer an analysis that reflects these three claims. Instead, I propose a derivational account of all-clefts based on Boeckx (2007).
This paper surveys the strategies that the Contrastive, Typological, and Translation Mining parallel corpus traditions rely on to deal with the issue of target language representativeness of translations. On the basis of a comparison of the corpus architectures and research designs of the three traditions, we argue that they have each developed their own representativeness strategies: (i) monolingual control corpora (Contrastive tradition), (ii) limits on the scope of research questions (Typological tradition), and (iii) parallel control corpora (Translation Mining tradition). We introduce normalized pointwise mutual information (NPMI) as a bi-directional measure of cross-linguistic association, allowing for an easy comparison of the outcomes of different traditions and the impact of the monolingual and parallel control corpus representativeness strategies. We further argue that corpus size has a major impact on the reliability of the monolingual control corpus strategy and that a sequential parallel control corpus strategy is preferable for smaller corpora.
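The abstract above does not spell out the NPMI formula, so the following is only a minimal sketch based on the standard definition, NPMI(x, y) = PMI(x, y) / -log p(x, y), with PMI(x, y) = log(p(x, y) / (p(x) p(y))). The function name, the parameter names, and the idea of counting aligned translation pairs are assumptions made for illustration and are not taken from the paper.

```python
import math


def npmi(joint_count, count_x, count_y, total):
    """Normalized PMI between form x (one language) and form y (another).

    joint_count: aligned pairs containing both x and y (hypothetical counts)
    count_x, count_y: aligned pairs containing x, respectively y
    total: number of aligned pairs in the corpus
    """
    if joint_count == 0:
        return -1.0  # conventional limit: the two forms never co-occur
    p_xy = joint_count / total
    if p_xy == 1.0:
        return 1.0  # both forms occur in every pair; avoids dividing by log(1) = 0
    p_x = count_x / total
    p_y = count_y / total
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


if __name__ == "__main__":
    # Made-up counts: x and y each occur 10 times and always together.
    print(npmi(10, 10, 10, 100))
    # Made-up counts: co-occurrence at exactly the rate expected by chance.
    print(npmi(25, 50, 50, 100))
```

The score ranges from -1 (the forms never co-occur) through 0 (independence) to 1 (the forms always co-occur). Swapping the roles of x and y leaves the value unchanged, which is the sense in which the measure can be read in both directions.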
This paper reports on the state of the art in the application of multidimensional scaling (MDS) techniques to create semantic maps in linguistic research. MDS refers to a statistical technique that represents objects (lexical items, linguistic contexts, languages, etc.) as points in a space so that greater similarity between objects corresponds to smaller distances between the corresponding points in the representation. We focus on the use of MDS in combination with parallel corpus data as used in research on cross-linguistic variation. We first introduce the mathematical foundations of MDS and then give an exhaustive overview of past research that employs MDS techniques in combination with parallel corpus data. We propose a set of terms to succinctly describe the key parameters of a particular MDS application. We then show that this computational methodology is theory-neutral, i.e. it can be employed to answer research questions in a variety of linguistic theoretical frameworks. Finally, we show how this leads to two lines of future developments for MDS research in linguistics.
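To make the idea of placing objects as points concrete, here is a minimal sketch of classical (Torgerson) MDS using NumPy. The abstract does not say which MDS variant the surveyed studies use, so the choice of the classical variant, the function name `classical_mds`, and the toy dissimilarity matrix are all assumptions made for illustration.

```python
import numpy as np


def classical_mds(distances, n_dims=2):
    """Embed objects as points so that point distances approximate `distances`.

    distances: symmetric (n, n) matrix of pairwise dissimilarities
    n_dims: number of dimensions of the resulting map
    """
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues that arise from non-Euclidean input.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


if __name__ == "__main__":
    # Hypothetical dissimilarities between three contexts (a 3-4-5 triangle).
    toy = [[0, 3, 4],
           [3, 0, 5],
           [4, 5, 0]]
    print(classical_mds(toy, n_dims=2))
```

The resulting coordinates are only determined up to rotation and reflection, so the axes of such a map carry no meaning until they are interpreted against the data.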