2007
DOI: 10.1007/s10618-006-0049-3

Compression-based data mining of sequential data

Abstract: The vast majority of data mining algorithms require the setting of many input parameters. The dangers of working with parameter-laden algorithms are twofold. First, incorrect settings may cause an algorithm to fail in finding the true patterns. Second, a perhaps more insidious problem is that the algorithm may report spurious patterns that do not really exist, or greatly overestimate the significance of the reported patterns. This is especially likely when the user fails to understand the role of parameters in…

Citations: cited by 100 publications (82 citation statements)
References: 36 publications (47 reference statements)
“…DTW has been applied to find similarity metrics between signals [26] in particular in the context of gesture recognition, given the potentially more limited number of required samples [33]. More recently, researchers have also developed the derivative DTW (DDTW) [25] and used it to detect activities such as walking, going up and down flights of stairs [39].…”
Section: Dynamic Time Warping (DTW)
Citation type: mentioning
confidence: 99%
“…Instead, the CK measure works in the spirit of Li and Vitanyi's idea that two objects can be considered similar if information garnered from one can help compress the other (Li et al 2003; Keogh et al 2007). The theoretical implications of this idea have been heavily explored over the last eight years, and numerous applications for discrete data (DNA, natural languages) have emerged.…”
Section: A Review of the CK Measure
Citation type: mentioning
confidence: 99%
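
The idea quoted above, that two objects are similar when one helps compress the other, can be illustrated with a small sketch. This is not the CK measure itself and is not taken from the excerpt: it assumes `zlib` as a stand-in compressor and uses the ratio C(xy) / (C(x) + C(y)) as one plausible way to score "helps compress". The function names `compressed_size` and `compression_dissimilarity` are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def compression_dissimilarity(x: bytes, y: bytes) -> float:
    """Ratio of the compressed size of x+y to the sum of the separate compressed sizes.

    Assumed scoring rule for illustration: values well below 1 suggest that
    x and y share structure the compressor can reuse; values close to 1
    suggest they have little in common.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    text = b"the quick brown fox jumps over the lazy dog " * 20
    noise = bytes(rng.randrange(256) for _ in range(len(text)))

    # A sequence paired with itself should score lower than when paired with noise.
    print(compression_dissimilarity(text, text))
    print(compression_dissimilarity(text, noise))
```

The only choice to make here is the compressor, which is consistent with the abstract's concern about parameter-laden algorithms. Real-valued time series would first need to be converted to a discrete byte sequence, a step this sketch leaves out.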
“…NCD was introduced by Cilibrasi and its applicability to various problems such as clustering and classification was demonstrated [8]. NCD has been experimentally evaluated on a number of problems, including: classification of biological sequences and structures [15], novelty detection in patient histories [13], mining of sequential data [23], and even static analysis of source code [3]. It has also been evaluated in terms of the impact of information distortion on the compression [19].…”
Section: Aim and Scope
Citation type: mentioning
confidence: 99%
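
For readers unfamiliar with NCD (normalized compression distance), a minimal sketch follows. It uses the commonly stated form NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), which the excerpt itself does not give, and assumes `zlib` as the compressor C. The helper names `csize` and `ncd` are hypothetical.

```python
import zlib


def csize(data: bytes) -> int:
    """Compressed length under zlib, used here as an assumed approximation of C(.)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    With a real compressor the result is only approximately in [0, 1]:
    near 0 for highly similar inputs, near 1 for unrelated ones.
    """
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    import random

    rng = random.Random(1)
    a = b"ACGTACGTTTGACCA" * 40
    b = bytes(rng.randrange(256) for _ in range(len(a)))
    print(ncd(a, a))  # small: a adds little new information to itself
    print(ncd(a, b))  # larger: noise shares no structure with a
```

Swapping in a different compressor (for example `bz2` or `lzma` from the standard library) changes the numerical values, so distances are only comparable when computed with the same compressor.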