2018
DOI: 10.3127/ajis.v22i0.1538

Comparing sets of patterns with the Jaccard index

Abstract: The ability to extract knowledge from data has been the driving force of Data Mining since its inception, and of statistical modelling long before even that. Actionable knowledge often takes the form of patterns, where a set of antecedents can be used to infer a consequent. In this paper we offer a solution to the problem of comparing different sets of patterns. Our solution allows comparisons between sets of patterns that were derived from different techniques (such as different classification algorithms), or…

Cited by 87 publications (52 citation statements)
References 35 publications
“…Thus, there are many distance functions, which are used to define a distance between items or elements. These distance functions such as; Cosine similarity (Shirkhorshidi et al 2015), Jaccard distance (Fletcher and [Islam] 2018), Manhattan distance (Pandit and Gupta 2011), Euclidean distance (Dokmanic et al 2015), etc. The cluster is constructed in a way that any two data items associated with the same cluster have the minimum value of distance and any two data items associated with different clusters have the maximum value of distance (Zhu et al 2019).…”
Section: Feature Clustering Phase (FCP) | mentioning
confidence: 99%
“…However, the patterns in the frequencies of the features and their cutoffs are relatively similar to the denovo model. We use the Jaccard similarity index, weighted by features frequency [15], to estimate this similarity and obtain a value of 0.74, which could be considered high. Similar to denovo model, the scrambled model relies on high values of backbone feature interacting with helix but focuses more on the values of sheet and sidechain features for detecting early folding residues.…”
Section: Results | mentioning
confidence: 99%
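
The statement above mentions a Jaccard index weighted by feature frequency but does not give the weighting scheme. As a rough illustration only, the sketch below uses one common generalization (sum of per-feature minima divided by sum of per-feature maxima), which may differ from what the cited work does. The function name `weighted_jaccard` and the feature names and counts are hypothetical.

```python
def weighted_jaccard(freq_a, freq_b):
    """Weighted Jaccard similarity between two feature-frequency mappings.

    Assumption: weights are non-negative frequencies, and the similarity is
    sum(min) / sum(max) over all features. This is one common generalization,
    not necessarily the scheme used in the quoted work.
    """
    features = set(freq_a) | set(freq_b)
    numerator = sum(min(freq_a.get(f, 0), freq_b.get(f, 0)) for f in features)
    denominator = sum(max(freq_a.get(f, 0), freq_b.get(f, 0)) for f in features)
    if denominator == 0:
        # Assumption: two empty (or all-zero) profiles count as identical.
        return 1.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical feature frequencies for two models.
    model_a = {"backbone": 5, "helix": 3, "sheet": 1}
    model_b = {"backbone": 4, "helix": 3, "sidechain": 2}
    print(weighted_jaccard(model_a, model_b))  # (4 + 3) / (5 + 3 + 1 + 2) = 7/11
```

When every frequency is 0 or 1, this reduces to the ordinary set-based Jaccard index.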
“…From this point, the Jaccard similarity can be calculated as shown in Table 3. The Jaccard similarity (coefficient) (Fletcher and Islam, 2018) is a term coined by Paul Jaccard to measure similarities between sets. It is defined as the size of the intersection divided by the size of the union of two sets.…”
Section: Results | mentioning
confidence: 99%
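
The definition quoted above (size of the intersection divided by size of the union) can be sketched in a few lines. This is a minimal illustration, not the implementation from any of the cited works; the function names and the example items are hypothetical, and the handling of two empty sets is an assumption.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Jaccard distance, the complement of the similarity."""
    return 1.0 - jaccard_similarity(a, b)


if __name__ == "__main__":
    # Hypothetical item sets.
    x = {"age", "income", "region"}
    y = {"income", "region", "tenure"}
    print(jaccard_similarity(x, y))  # 2 shared items / 4 distinct items = 0.5
    print(jaccard_distance(x, y))    # 1 - 0.5 = 0.5
```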