2016
DOI: 10.1142/s0217979215410052

Long-range correlations and burstiness in written texts: Universal and language-specific aspects

Abstract: Recently, methods from the statistical physics of complex systems have been applied successfully to identify universal features in the long-range correlations (LRCs) of written texts. However, in real texts, these universal features are being intermingled with language-specific influences. This paper aims at the characterization and further understanding of the interplay between universal and language-specific effects on the LRCs in texts. To this end, we apply the language-sensitive mapping of written texts t…

Cited by 4 publications (3 citation statements)
References 21 publications

“…As a result of a statistical analysis of a multi-language corpus of physics-related texts, Constantoudis et al (2015) demonstrated how the burstiness of long word appearances contributes more to the language-specific aspects of full-word-length correlations. The authors concluded that the correlations between inter-long word distances are less sensitive to language dependencies.…”
Section: Perplexity and Burstiness In Written Texts
Citation type: mentioning
confidence: 99%
“…Several papers have used similar methods to map texts into token length sequences [24, 56–59]. The token length sequence mapping method (Eq.…”
Section: Token Length Sequence (TLS)
Citation type: mentioning
confidence: 99%
“…For example, co-occurrence networks might fail at capturing the relationship between distant words. Constantoudis et al (2015) reported that long-range correlations in written texts occur due to the multidimensional mapping of thoughts and ideas in chains of words. In order to account for the presence of relevant links between non-adjacent words, we proposed an extension called Further Neighborhood.…”
Section: Extensions Of Co-occurrence Network
Citation type: mentioning
confidence: 99%