“…Analysis of natural language from a complex systems perspective has provided new insights into statistical properties of language, such as statistical laws [1, 2, 3, 4, 5, 6, 7, 8, 9], networks [10, 11, 12, 13, 14], language change [15, 16, 17, 18, 19, 20], quantification of information content [21, 22, 23, 24], and the role of syntactic structures [25] or punctuation [26]. In particular, the availability of large public datasets such as the Google n-gram data [27], the full Wikipedia dataset [28, 29], or Twitter [30] has opened the door to new large-scale quantitative approaches.…”