“…Compression algorithms are especially efficient when applied to natural language because such texts contain so many redundancies (Brillouin, 2004). Scholars have demonstrated how zipping can be used to measure language similarity between two or more texts (Baronchelli, Caglioti, & Loreto, 2005) and how compression algorithms and entropy-based approaches are useful for measuring online texts (Gordon, Cao, & Swanson, 2007; Huffaker, Jorgensen, Iacobelli, Tepper, & Cassell, 2006; Nigam, Lafferty, & McCallum, 1999; Schneider, 1996). In our study, entropy represents the number of similar linguistic choices at both the word and phrase levels.…”
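To illustrate the general idea of zip-based similarity (not necessarily the exact procedure used by the cited studies), the sketch below computes a Normalized Compression Distance with Python's standard `zlib` compressor: two texts that share vocabulary and phrasing compress better when concatenated, so their distance is lower. The sample strings are invented for demonstration.

```python
import zlib

def clen(s: str) -> int:
    """Length in bytes of the zlib-compressed string."""
    return len(zlib.compress(s.encode("utf-8")))

def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance: lower values mean more similar texts."""
    cx, cy, cxy = clen(x), clen(y), clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Hypothetical sample texts: a and b share most words; d does not.
a = "the quick brown fox jumps over the lazy dog " * 20
b = "the quick brown fox leaps over the sleepy cat " * 20
d = "completely unrelated vocabulary with no shared phrasing here " * 20

print(ncd(a, b))  # small: texts overlap heavily
print(ncd(a, d))  # larger: little shared redundancy to exploit
```

Because the compressor exploits repeated substrings, this single measure captures similarity at both the word and phrase levels, which is the intuition behind the entropy measure described in the excerpt.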