A new set of parameters is proposed to describe the word frequency behavior of texts. An analogy between the word frequency distribution and the Bose distribution is suggested, and the notion of "temperature" is introduced for this case. Calculations are made for the English, Ukrainian, and Guinean Maninka languages. A correlation is shown to exist between deep language structure (the level of analyticity) and the defined parameters.
We present the results of a network analysis of Ukrainian texts. Autosemantic (meaningful) words are treated as network vertices, and two vertices are linked when the corresponding words occur in the same sentence. Subnetworks corresponding to specific parts of speech (verbs, nouns, adjectives, etc.) are also built. The obtained networks are small-world and scale-free. For comparison, random texts with parameters corresponding to those of real texts are generated using several approaches. Various network parameters are calculated, including transitivity, betweenness, degree centralization, mean distance, network diameter, and exponents of the degree distribution. Comparison of the network parameters of real and generated texts shows that the borders between them are quite fuzzy.
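The sentence-level linking rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_network` and `transitivity`, the tiny English `STOP_WORDS` set (standing in for a real filter of non-autosemantic words), and whitespace tokenization are all assumptions made for illustration. Transitivity is computed here as the usual ratio of closed to connected vertex triples.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical stop list: a stand-in for a proper filter that keeps
# only autosemantic (meaningful) words.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "on"}


def build_network(sentences, stop_words=STOP_WORDS):
    """Link every pair of meaningful words that share a sentence.

    Returns an adjacency mapping {word: set of neighbouring words}.
    Words that never co-occur with another meaningful word are omitted.
    """
    adjacency = defaultdict(set)
    for sentence in sentences:
        # Assumption: naive whitespace tokenization, no punctuation handling.
        words = {w for w in sentence.lower().split() if w not in stop_words}
        for u, v in combinations(sorted(words), 2):
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)


def transitivity(adjacency):
    """Fraction of connected triples (paths of length 2) that are closed."""
    closed = 0   # each triangle is counted three times, once per centre vertex
    triples = 0  # all connected triples, counted once per centre vertex
    for neighbours in adjacency.values():
        k = len(neighbours)
        triples += k * (k - 1) // 2
        for u, v in combinations(neighbours, 2):
            if v in adjacency[u]:
                closed += 1
    return closed / triples if triples else 0.0


if __name__ == "__main__":
    toy_text = ["the cat chased a mouse", "a mouse ate cheese", "the cat ate"]
    network = build_network(toy_text)
    degrees = {word: len(nbrs) for word, nbrs in network.items()}
    print(degrees)
    print(transitivity(network))
```

Because every sentence contributes a complete subgraph on its words, such networks tend to have high transitivity by construction, which is one reason comparison against generated random texts is informative.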