The paper presents the results of a study that is part of a large-scale project on the changes that took place in the Russian language during the first three decades of the 20th century. In the history of Russia, this period was marked by turbulent events that led to a radical change in the state system and the formation of a new society. To quantify the scale of the changes that these dramatic events brought about in the language, it is necessary to analyze a representative volume of linguistic data and to compare different chronological periods using quantitative methods. The research was carried out on an annotated sample from the Corpus of the Russian Short Stories of 1900-1930, which contains texts by 300 Russian writers. All the texts in the Corpus are divided into three time frames: 1) the pre-war period (1900-1913), 2) the war and revolutionary years (1914-1922), and 3) the early Soviet period (1923-1930). The frequency distribution of significant vocabulary was analyzed over time, which made it possible to identify the main tendencies in how the frequencies of individual words and lexical groups change from one historical period to another and to correlate them with the previously identified dynamics of literary themes. The technique makes it possible to trace the influence of large-scale political changes on the vocabulary of the literary language and to note the peculiarities and tendencies of the writers' worldview in a given historical period. It also significantly supplements the analysis of the dynamics of literary themes in fiction.
The book offers a novel approach to analyzing natural language texts and representing relevant information in a metalinguistic database structure. The methodological principles underlying the processing techniques are the same at all stages. They draw on an empirically based theory that keeps the rules well grounded and consistent and guards against subjectivity. The ultimate goal of the processing is a conceptual structure made up of elements referring to extralinguistic objects and situations. The elements are assigned ontological properties and linked together by conceptual relations. This makes it possible to extract “pure” information from the text, the reliability of which depends solely on the author.