2003
DOI: 10.1103/physrevlett.90.089803

Comment on “Language Trees and Zipping”

Citations: cited by 20 publications (15 citation statements)
References: 7 publications
“…This method was strongly criticized by several researchers (Goodman, 2002; Khmelev & Teahan, 2003b) indicating many weaknesses. First, it is too slow since it has to call the compression algorithm so many times (as many as the training texts).…”
Section: Similarity-based Models (mentioning)
confidence: 99%
“…Looking for better approximations, however, we must not lose general applicability to arbitrary domains. In [6] it is argued that nth order Markov chain models perform better than gzip for text recognition. But optimization to particular domains might endanger generality.…”
Section: A Complex Task: Image Retrieval (mentioning)
confidence: 99%
“…This suggests that compression algorithms might be useful to offer a principled way to measure pattern similarity. For the domain of text, Benedetto et al have shown that ordinary file compression tools based on the Lempel-Ziv algorithm (LZ77) [1] can perform language recognition and authorship attribution, although these programs were never designed for this kind of tasks [2], [3] (for comments see [4], [5], [6]). Given n versions T 1 .…”
Section: A Recognition By Compression (mentioning)
confidence: 99%
“…Perhaps the most controversial work is that of Benedetto et al [46] who addressed the authorship authentication problem using gzip coupled with BCN (discussed in Section III-B1). To get more information about what happened, the interested reader is referred to following sources [47], [48], [49], [50], [51].…”
Section: B Previous Work On Ctc (mentioning)
confidence: 99%