This article proposes a methodology for addressing three long-standing problems in near-synonym research. First, we show how the internal structure of a group of near synonyms can be revealed. Second, we deal with the problem of distinguishing the subclusters, and the words within those subclusters, from each other. Finally, we illustrate how these results identify the semantic properties that should be mentioned in lexicographic entries. We illustrate our methodology with a case study of nine near-synonymous Russian verbs that, in combination with an infinitive, express TRY. Our approach is corpus-linguistic and quantitative: assuming a strong correlation between semantic and distributional properties, we analyze 1,585 occurrences of these verbs taken from the Amsterdam Corpus and the Russian National Corpus, supplemented where necessary with data from the Web. We code each instance in terms of 87 variables (a.k.a. ID tags), i.e., morphosyntactic, syntactic, and semantic characteristics that form a verb's behavioral profile. The resulting co-occurrence table is evaluated by means of a hierarchical agglomerative cluster analysis and additional quantitative methods. The results show that this behavioral-profile approach can be used (i) to elucidate the internal structure of the group of near-synonymous verbs and present it as a radial network structured around a prototypical member and (ii) to make explicit the scales of variation along which the near-synonymous verbs vary.
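The core quantitative step described above can be sketched as follows: a co-occurrence table of verbs by ID-tag frequencies is submitted to hierarchical agglomerative clustering. The verb names, the three toy ID-tag columns, and all the numbers below are illustrative placeholders, not the study's 87-variable data.

```python
# A minimal sketch (not the authors' actual pipeline) of clustering
# verbs by their behavioral profiles with SciPy's hierarchical
# agglomerative clustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical behavioral profiles: rows = verbs, columns = relative
# frequencies of ID tags (morphosyntactic/semantic variables).
verbs = ["probovat'", "pytat'sja", "starat'sja", "silit'sja"]
profiles = np.array([
    [0.60, 0.10, 0.30],
    [0.55, 0.15, 0.30],
    [0.20, 0.50, 0.30],
    [0.15, 0.55, 0.30],
])

# Ward amalgamation over Euclidean distances between profiles.
Z = linkage(profiles, method="ward")

# Cut the dendrogram into two subclusters.
labels = fcluster(Z, t=2, criterion="maxclust")
clusters = {v: int(c) for v, c in zip(verbs, labels)}
print(clusters)
```

Inspecting which verbs end up in the same subcluster, and at what amalgamation height they merge, is what supports statements about internal structure and about a prototypical member.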
The goal of the present study is to understand the roles that orthographic and semantic information play in the behaviour of skilled readers. Reading latencies from a self-paced sentence reading experiment in which Russian near-synonymous verbs were manipulated appear well predicted by a combination of bottom-up sub-lexical letter triplets (trigraphs) and top-down semantic generalizations, modelled using the Naive Discriminative Learner. The results reveal a complex interplay of bottom-up and top-down support from orthography and semantics to the target verbs, whereby activations from orthography alone are modulated by individual differences. Using performance on a serial reaction time task as a novel operationalization of the mental speed hypothesis, we explain the observed individual differences in reading behaviour in terms of the exploration/exploitation hypothesis from Reinforcement Learning, where initially slower and more variable behaviour leads to better performance overall.
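Discriminative learning of the kind described above rests on the Rescorla-Wagner update rule: links from present cues to present outcomes are strengthened, links to absent outcomes weakened. The sketch below is a toy illustration of that rule only, not the study's trained model; the trigraph cues and verb outcomes are invented for the example.

```python
# A minimal sketch of the Rescorla-Wagner update underlying
# discriminative learning. Cues are hypothetical letter trigraphs;
# outcomes are hypothetical verb lexomes.
from collections import defaultdict

def rw_update(weights, cues, outcomes, all_outcomes, alpha=0.1, lam=1.0):
    """One learning step: move each outcome's summed activation from
    the present cues toward lam (if the outcome occurred) or 0."""
    for o in all_outcomes:
        total = sum(weights[(c, o)] for c in cues)
        target = lam if o in outcomes else 0.0
        delta = alpha * (target - total)
        for c in cues:
            weights[(c, o)] += delta

weights = defaultdict(float)
all_outcomes = {"probovat'", "starat'sja"}

# Repeated exposure: word-initial trigraphs reliably co-occur
# with "their" verb, so cue-outcome links strengthen over trials.
for _ in range(100):
    rw_update(weights, ["#pr", "pro"], {"probovat'"}, all_outcomes)
    rw_update(weights, ["#st", "sta"], {"starat'sja"}, all_outcomes)

# Activation of a verb from its orthographic cues approaches lam = 1.0.
activation = sum(weights[(c, "probovat'")] for c in ["#pr", "pro"])
print(round(activation, 3))
```

In a full model, activations computed this way (from both orthographic and semantic cue sets) would enter a regression on reading latencies.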
Over the past 10 years, Cognitive Linguistics has taken a Quantitative Turn. Yet, concerns have been raised that this preoccupation with quantification and modelling may not bring us any closer to understanding how language works. We show that this objection is unfounded, especially if we rely on modelling techniques based on biologically and psychologically plausible learning algorithms. These make it possible to take a quantitative approach, while generating and testing specific hypotheses that will advance our understanding of how knowledge of language emerges from exposure to usage.
Linguistic convention typically allows speakers several options. Evidence is accumulating that the various options are preferred in different contexts, yet the criteria governing the selection of the appropriate form are often far from obvious. Most researchers who attempt to discover the factors determining a preference rely on the linguistic analysis and statistical modeling of data extracted from large corpora. In this paper, we address the question of how to evaluate such models and explicitly compare the performance of a statistical model derived from a corpus with that of native speakers in selecting one of six Russian TRY verbs. Building on earlier work, we trained a polytomous logistic regression model to predict verb choice given the sentential context. We compare the predictions the model makes for 60 unseen sentences to the choices adult native speakers make in those same sentences. We then look in more detail at the interplay of the contextual properties and model computationally how individual differences in assessing the importance of contextual properties may impact the linguistic knowledge of native speakers. Finally, we compare the probability the model assigns to encountering each of the six verbs in the 60 test sentences to the acceptability ratings the adult native speakers give to those sentences. We discuss the implications of our findings for both usage-based theory and empirical linguistic methodology.
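A polytomous (multinomial) logistic regression of the kind described above can be sketched as follows. The feature names, the toy training data, and the restriction to two of the six verbs are illustrative assumptions, not the authors' model or corpus.

```python
# A minimal sketch (toy data) of multinomial logistic regression
# predicting verb choice from contextual properties, via scikit-learn.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical binary contextual features per sentence, e.g.
# [subject_is_animate, infinitive_is_perfective, clause_is_negated].
X = np.array([
    [1, 1, 0], [1, 1, 0], [1, 0, 0],
    [0, 1, 1], [0, 1, 1], [0, 0, 1],
])
# Hypothetical verb choices (two of the six TRY verbs, for brevity).
y = np.array(["pytat'sja", "pytat'sja", "pytat'sja",
              "starat'sja", "starat'sja", "starat'sja"])

model = LogisticRegression().fit(X, y)

# For an unseen sentential context, the model yields both a predicted
# verb (comparable to speakers' forced choices) and per-verb
# probabilities (comparable to speakers' acceptability ratings).
context = np.array([[1, 1, 0]])
pred = model.predict(context)[0]
probs = dict(zip(model.classes_, model.predict_proba(context)[0]))
print(pred, probs)
```

Evaluating such a model against native speakers then amounts to comparing `pred` with speakers' choices and `probs` with their graded ratings over the same test sentences.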