Discussion of Indo-European origins and dispersal focuses on two hypotheses. Qualitative evidence from reconstructed vocabulary and correlations with archaeological data suggest that Indo-European languages originated in the Pontic-Caspian steppe and spread together with cultural innovations associated with pastoralism, beginning c. 6500–5500 bp. An alternative hypothesis, according to which Indo-European languages spread with the diffusion of farming from Anatolia, beginning c. 9500–8000 bp, is supported by statistical phylogenetic and phylogeographic analyses of lexical traits. The time and place of the Indo-European ancestor language therefore remain disputed. Here we present a phylogenetic analysis in which ancestry constraints permit more accurate inference of rates of change, based on observed changes between ancient or medieval languages and their modern descendants, and we show that the result strongly supports the steppe hypothesis. Positing ancestry constraints also reveals that homoplasy is common in lexical traits, contrary to the assumptions of previous work. We show that lexical traits undergo recurrent evolution due to recurring patterns of semantic and morphological change.
This article investigates the evolutionary and spatial dynamics of typological characters in 117 Indo-European languages. We partition types of change (i.e., gain or loss) for each variant according to whether they bring about a simplification in morphosyntactic patterns that must be learned, whether they are neutral (i.e., neither simplifying nor introducing complexity) or whether they introduce a more complex pattern. We find that changes which introduce complexity show significantly less areal signal (according to a metric we devise) than changes which simplify and neutral changes, but we find no significant differences between the latter two groups. This result is compatible with a scenario where certain types of parallel change are more likely to be mediated by advergence and contact between proximate speech communities, while other developments are due purely to drift and are largely independent of intercultural contact.
This study uses phylogenetic methods adopted from computational biology in order to reconstruct features of Proto-Indo-European morphosyntax. We estimate the probability of the presence of typological features in Proto-Indo-European on the assumption that these features change according to a stochastic process governed by evolutionary transition rates between them. We compare these probabilities to previous reconstructions of Proto-Indo-European morphosyntax, which use either the comparative-historical method or implicational typology. We find that our reconstruction yields strong support for a canonical model (synthetic, nominative-accusative, head-final) of the protolanguage and low support for any alternative model. Observing the evolutionary dynamics of features in our data set, we conclude that morphological features have slower rates of change, whereas syntactic traits change faster. Additionally, more frequent, unmarked traits in grammatical hierarchies have slower change rates than less frequent, marked ones, which indicates that universal patterns of economy and frequency impact language change within the family.
The supplementary materials contain details of our BEAST analyses. Note that in addition to the four blocks of analyses (A-D) mentioned in Experiments (§6), there is a fifth block of eight analyses (E1-8) in which each ancestral language in A1, in turn, has its ancestry constraint removed. This document contains the following:

S.1 A synopsis of elements that vary between analyses.
S.2 A description of accompanying electronic files.
S.3 Summary trees and summary statistics for each phylogenetic analysis.

S.1 Analyses names and descriptors

Each BEAST analysis has a name (e.g. A1) and a descriptor (e.g. a1-c0-d0-g1-l2-s1-t1-z8) whose elements denote the following:

a  Ancestry constraints.
   0  No.
   1  Yes.
c  A set of clade and time constraints, and a set of languages (§4.1).
   0  Narrow dataset, no time constraints on splits.
   1  Medium dataset, no time constraints on splits.
   2  Broad dataset, no time constraints on splits.
   3  Narrow dataset, time constraints on splits in Table 12.
   4  From Bouckaert et al. (2013).
   5  Same as c4, but excluding six sparsely-attested languages.
s  Trait model.
   0  SDC (no among-trait rate variation).
   1  RSC with gamma-variate among-trait rate variation.
   2  RSC without among-trait rate variation.
   4  Covarion without among-trait rate variation.
t  Tree prior.
   1  Generalized skyline coalescent.
   2  Constant population coalescent.
u  If given, the adjoining eight bit number specifies, for each ancestral language, whether an ancestry constraint is used (1) or not used (0).
x  If given, the adjoining eight bit number specifies, for each ancestry-constrained language, whether its time constraint is ignored (1) or used (0).
z  If given, the adjoining number specifies the MCMC chain length, in multiples of 2 × 10^7; or else the chain length is 2 × 10^7.

S.2 Electronic files

The datasets d0 and d2 are in files sup/ielex-130421-ag-cc.txt and sup/ielex-betal.txt, respectively. The files associated with each analysis can be found in the directory sup/runs-post/descriptor.
Each such directory contains:

ieo.xml      XML configuration file.
info.txt     Information about the configuration.
ieo.log.sum  Statistics summarizing posterior.
mcc.trees    Summary tree.

As described in §5.3, we also run BEAST with the data removed in order to find the prior distribution of the root age. The associated files can be found in the directory sup/runs-prior/descriptor. The descriptor in this case lacks the d, g, l, and s elements; e.g. the directory for A1 is sup/runs-prior/a1-c0-t1-z8.

The XML configurations must be run using a customized version of BEAST, available at https://github.com/whdc/ieo-beast. After compiling BEAST, run beast ieo.xml from a UNIX shell.

S.3 Summary trees and statistics

For each analysis we give a plot of the MCC tree and summaries of other parameters in the posterior sample. Plots show the branch rate multipliers (width of horizontal lines), time constraints on ancient and medieval languages (bright red bars), clade constraints (vertical black bars), and posterior clade probabilities that round to less than 100%. There are 3...
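
The descriptor scheme in S.1 and the directory layout in S.2 can be illustrated with a short Python sketch. It is not part of the supplementary materials: the function names (parse_descriptor, chain_length, prior_descriptor) are hypothetical, and it assumes that every descriptor element is a single letter followed by its value and that elements are joined by hyphens, as in the example a1-c0-d0-g1-l2-s1-t1-z8.

```python
from typing import Dict

# Default MCMC chain length stated in S.1 (2 x 10^7).
BASE_CHAIN_LENGTH = 2 * 10**7

# Elements that S.2 says are absent from prior-only run descriptors.
DATA_ELEMENTS = {"d", "g", "l", "s"}


def parse_descriptor(descriptor: str) -> Dict[str, str]:
    """Split a descriptor such as 'a1-c0-t1' into {'a': '1', 'c': '0', 't': '1'}.

    Values are kept as strings so that eight-bit fields (u, x) keep
    any leading zeros.
    """
    elements: Dict[str, str] = {}
    for part in descriptor.split("-"):
        if len(part) < 2 or not part[0].isalpha():
            raise ValueError(f"malformed descriptor element: {part!r}")
        elements[part[0]] = part[1:]
    return elements


def chain_length(elements: Dict[str, str]) -> int:
    """Return the chain length: z multiples of 2 x 10^7, or 2 x 10^7 if z is absent."""
    z = elements.get("z")
    return BASE_CHAIN_LENGTH * int(z) if z is not None else BASE_CHAIN_LENGTH


def prior_descriptor(descriptor: str) -> str:
    """Drop the d, g, l, and s elements to get the prior-run descriptor."""
    kept = [p for p in descriptor.split("-") if p[0] not in DATA_ELEMENTS]
    return "-".join(kept)


if __name__ == "__main__":
    example = "a1-c0-d0-g1-l2-s1-t1-z8"
    parsed = parse_descriptor(example)
    print(parsed)
    print(chain_length(parsed))
    print("sup/runs-post/" + example)
    print("sup/runs-prior/" + prior_descriptor(example))
```

For the A1 example, the last line prints sup/runs-prior/a1-c0-t1-z8, which matches the directory named in S.2.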