The Iranian language family is the western branch of the Indo-Iranian language group, which itself belongs to the Indo-European language family. As Windfuhr (2009) states, “with an estimated 150–200 million native speakers, the Iranian language family is one of the world’s major language families.” The exact number of languages in this family is unknown, but it has been estimated at around 86 (Eberhard et al. 2019). Although there is no definite agreement on the classification of these languages, they can be roughly divided into four major groups: Northwestern, Southwestern, Northeastern, and Southeastern. These languages share several properties, but there are also major differences among them in their sound systems and their syntactic and morphosyntactic structures. These variations provide a novel and ideal laboratory for various types of linguistic research.
Bobaljik & Wurmbrand (2015) have recently developed a hypothesis that no language truly mixes wh-movement and wh-in-situ structures in its syntax, with seemingly optional wh-in-situ in a wh-movement language being analyzed as a question with declarative syntax. In this paper, we present novel data from Colloquial Singapore English (CSE) that call this hypothesis into question. Instead of assuming that the Q-feature of the interrogative C_WH head in a language must be specified in a binary manner (valued or unvalued), we propose that this feature is underspecified in languages such as CSE. The proposed amendment is not only sufficiently restrictive to cover the types of languages predicted by B&W's original hypothesis, but also flexible enough to accommodate languages with a mixed wh-system. We further argue that contact-based explanations, though plausible, need not be the reason CSE developed this specific trait, which could have arisen under independent, non-contact conditions. This position is supported by Malay and Ancash Quechua, two non-contact languages that nonetheless exhibit optional wh-in-situ, like CSE.
Pretrained transformer-based language models achieve state-of-the-art performance on many NLP tasks, but it is an open question whether the knowledge acquired by the models during pretraining resembles the linguistic knowledge of humans. We present both humans and pretrained transformers with descriptions of events, and measure their preference for telic interpretations (the event has a natural endpoint) or atelic interpretations (the event does not have a natural endpoint). To measure these preferences and determine what factors influence them, we design an English test and a novel-word test that include a variety of linguistic cues (noun phrase quantity, resultative structure, contextual information, temporal units) that bias toward certain interpretations. We find that humans' choice of telicity interpretation is reliably influenced by theoretically motivated cues, that transformer models (BERT and RoBERTa) are influenced by some (though not all) of the cues, and that transformer models often rely more heavily on temporal units than humans do.