2020
DOI: 10.3390/a13050123

Mining Sequential Patterns with VC-Dimension and Rademacher Complexity

Abstract: Sequential pattern mining is a fundamental data mining task with application in several domains. We study two variants of this task: the first is the extraction of frequent sequential patterns, whose frequency in a dataset of sequential transactions is higher than a user-provided threshold; the second is the mining of true frequent sequential patterns, which appear with probability above a user-defined threshold in transactions drawn from the generative process underlying the data. We present the first sampling…

Cited by 13 publications (5 citation statements)
References 30 publications
“…The following theorem is a generalization of a result for sequential patterns appearing in [22]. Here, we provide it for a general pattern mining task.…”
Section: Range Space Of Patterns
Citation type: mentioning
confidence: 89%
“…In this work, we propose a tighter upper bound on the capacity of a sequence to compute it and we apply it in a different scenario. More recently, Santoro et al [22] provide a sampling-based algorithm to compute approximations for the frequent sequential patterns problem, based on an upper bound on the VC-dimension of sequential patterns. They are also the first who consider the problem of mining true frequent sequential patterns, that are frequent sequential patterns w.r.t.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Diego Santoro [18] have introduced a sampling-based algorithm to mine the frequent items from huge databases. The introduced sampling algorithm utilized the ideology of VC-dimension which helps in approximating the frequent sequential pattern.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…MCRapper instead computes the exact n-MCERA of the family of interest on the observed sample, without having to consider the worst case. For other kinds of patterns, Riondato and Vandin [20] studied the pseudodimension of subgroups, while Servan-Schreiber et al [22] and Santoro et al [21] considered the (empirical) VC-dimension and Rademacher averages for sequential patterns. MCRapper can be applied in all these cases, and obtains better bounds because it uses the sample-and-distribution-dependent n-MCERA, rather than a worst case dataset-dependent bound.…”
Section: Related Work
Citation type: mentioning
confidence: 99%