We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics’, focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. ‘Culturomics’ extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
Bacterial pathogens evolve during the infection of their human hosts 1-8, but separating adaptive and neutral mutations remains challenging 9-11. Here, we identify bacterial genes under adaptive evolution by tracking recurrent patterns of mutations in the same pathogenic strain during the infection of multiple patients. We conducted a retrospective study of a Burkholderia dolosa outbreak among people with cystic fibrosis, sequencing the genomes of 112 isolates collected from 14 individuals over 16 years. We find that 17 bacterial genes acquired non-synonymous mutations in multiple individuals, which indicates parallel adaptive evolution. Mutations in these genes illuminate the genetic basis of important pathogenic phenotypes, including antibiotic resistance and bacterial membrane composition, and implicate oxygen-dependent gene regulation as paramount in lung infections. Several genes have not been previously implicated in pathogenesis, suggesting new therapeutic targets. The identification of parallel molecular evolution suggests key selection forces acting on pathogens within humans and can help predict and prepare for their future evolutionary course.
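The core counting idea in the abstract above, finding genes that acquired non-synonymous mutations in more than one individual, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the record layout `(patient_id, gene, is_nonsynonymous)`, the `min_patients` threshold, and the toy records are all hypothetical, and a real analysis would also need variant calling and a statistical test against the neutral expectation.

```python
from collections import defaultdict


def genes_mutated_in_multiple_patients(mutations, min_patients=2):
    """Count, per gene, the distinct patients carrying a non-synonymous mutation.

    `mutations` is an iterable of (patient_id, gene, is_nonsynonymous) tuples.
    This record layout is an assumption made for illustration only.
    Returns {gene: patient_count} for genes seen in at least `min_patients`.
    """
    patients_by_gene = defaultdict(set)
    for patient_id, gene, is_nonsynonymous in mutations:
        if is_nonsynonymous:
            # A set ensures repeated hits in one patient are counted once.
            patients_by_gene[gene].add(patient_id)
    return {
        gene: len(patients)
        for gene, patients in patients_by_gene.items()
        if len(patients) >= min_patients
    }


# Hypothetical toy records, not data from the study.
toy_mutations = [
    ("patient_A", "gene_1", True),
    ("patient_B", "gene_1", True),   # gene_1 hit independently in two patients
    ("patient_A", "gene_2", True),
    ("patient_A", "gene_2", True),   # two hits, but only one patient
    ("patient_C", "gene_3", False),  # synonymous changes are ignored
    ("patient_D", "gene_3", False),
]

candidates = genes_mutated_in_multiple_patients(toy_mutations)
print(candidates)
```

Counting distinct patients rather than raw mutations is the design choice that matters here: several mutations in one patient may share a single origin, whereas hits in separate patients are the recurrence the abstract treats as evidence of parallel evolution.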
Human language is based on grammatical rules 1-4. Cultural evolution allows these rules to change over time 5. Rules compete with each other: as new rules rise to prominence, old ones die away. To quantify the dynamics of language evolution, we studied the regularization of English verbs over the last 1200 years. Although an elaborate system of productive conjugations existed in English's Proto-Germanic ancestor, modern English uses the dental suffix, -ed, to signify past tense 6. Here, we describe the emergence of this linguistic rule amidst the evolutionary decay of its exceptions, known to us as irregular verbs. We have generated a dataset of verbs whose conjugations have been evolving for over a millennium, tracking inflectional changes to 177 Old English irregulars. Of these irregulars, 145 remained irregular in Middle English and 98 are still irregular today. We study how the rate of regularization depends on the frequency of word usage. The half-life of an irregular verb scales as the square root of its usage frequency: a verb that is 100 times less frequent regularizes 10 times as fast. Our study provides a quantitative analysis of the regularization process by which ancestral forms gradually yield to an emerging linguistic rule.
Natural languages comprise elaborate systems of rules which enable one speaker to communicate with another 7. These rules serve to simplify the production of language and enable an infinite array of comprehensible formulations 8-10. Yet each rule has exceptions, and even the rules themselves wax and wane over centuries and millennia 11,12.
Verbs which obey standard rules of conjugation in their native language are called regular verbs 13. In the modern English language, regular verbs are conjugated into the simple past and past participial forms by appending the dental suffix -ed to the root (for instance, talk/talked/talked). Irregular verbs obey antiquated rules (sing/sang/sung) or in some cases, no rule at all (go/went) 14,15.
New verbs entering English universally obey the regular conjugation (google/googled/googled), and many irregular verbs eventually regularize. Regular verbs become irregular much more rarely: for every sneak that snuck in 16, there are many more flews that flied out.
Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests. Correspondence and requests for materials should be addressed to M. A. N. (martin_nowak@harvard.edu). * These authors contributed equally to this work. Supplementary Information is linked to the online version of the paper at www.nature.com/nature.
Although less than 3% of modern verbs are irregular, the ten most common verbs are all irregular (be, have, do, go, say, can, will, see, take, get). The irregular verbs are heavily biased towards high frequencies of occurrence 17,18. Linguists have suggested an evolutionary hypothesis underlying the frequency distribution of irregular verbs: uncommon irregular verbs tend to disappear more r...
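The square-root scaling stated in the abstract above (a verb that is 100 times less frequent regularizes 10 times as fast) can be checked with a short calculation. This is a minimal sketch that assumes only the proportionality half-life ∝ sqrt(frequency). The function names are hypothetical, and the constant of proportionality is left out because only ratios between two verbs are compared.

```python
import math


def relative_half_life(frequency_ratio: float) -> float:
    """Half-life of verb A divided by half-life of verb B.

    `frequency_ratio` is (usage frequency of A) / (usage frequency of B).
    Assumes half-life is proportional to the square root of usage frequency.
    """
    if frequency_ratio <= 0:
        raise ValueError("frequency_ratio must be positive")
    return math.sqrt(frequency_ratio)


def relative_regularization_rate(frequency_ratio: float) -> float:
    """How many times faster verb A regularizes than verb B.

    The rate is taken to be inversely proportional to the half-life.
    """
    return 1.0 / relative_half_life(frequency_ratio)


# A verb used 100 times less often than a reference verb.
ratio = 1 / 100
print("half-life ratio:", relative_half_life(ratio))
print("regularization speed-up:", relative_regularization_rate(ratio))
```

Under this assumption the rarer verb's half-life is roughly one tenth of the reference verb's, so it regularizes about ten times as fast, matching the figure quoted in the abstract.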