On detecting differences between groups

Webb, Geoffrey I.; Butler, Shane; Newlands, Douglas

doi:10.1145/956755.956781

Cited by 32 publications (38 citation statements)

References 3 publications

“…Hence, completeness can only be ensured by a post-pruning process. The same problem was identified in Webb et al (2003) using Magnum Opus. CAREN implements two versions of this pruning process.…”

Section: Definition (mentioning)

confidence: 60%

Ensembles of jittered association rule classifiers

Azevedo, Jorge (2010). Data Min Knowl Disc

The ensembling of classifiers tends to improve predictive accuracy. To obtain an ensemble with N classifiers, one typically needs to run N learning processes. In this paper we introduce and explore Model Jittering Ensembling, where one single model is perturbed in order to obtain variants that can be used as an ensemble. We use as base classifiers sets of classification association rules. The two methods of jittering ensembling we propose are Iterative Reordering Ensembling (IRE) and Post Bagging (PB). Both methods start by learning one rule set over a single run, and then produce multiple rule sets without relearning. Empirical results on 36 data sets are positive and show that both strategies tend to reduce error with respect to the single model association rule classifier. A bias-variance analysis reveals that while both IRE and PB are able to reduce the variance component of the error, IRE is particularly effective in reducing the bias component. We show that Model Jittering Ensembling can represent a very good speed-up w.r.t. multiple model learning ensembling. We also compare Model Jittering with various state of the art classifiers in terms of predictive accuracy and computational efficiency.

“…Work in (Bay & Pazzani, 1999, 2001; Webb et al, 2003) focus on mining contrast sets: conjunctions of attributes and values that differ meaningfully in their distribution across groups. Those allow us to answer queries of the form, "How are History and Computer Science students different?"…”

Section: Related Work (mentioning)

confidence: 99%

“…Furthermore, software companies can devise well-performed antispam email systems based on these differences. Therefore, there are some researches reported on mining group differences between contrast groups from observational multivariate data (Bay & Pazzani, 1999, 2001; Webb, Butler, & Newlands, 2003).…”

Section: Introduction (mentioning)

confidence: 99%

Estimating confidence intervals for structural differences between contrast groups with missing data

Qin, Zhang, Zhu et al. (2009). Expert Systems with Applications

“…Some particular data mining techniques, known as contrast-set mining (Bay and Pazzani, 2001; Dong and Li, 1999; Webb et al, 2003), have been designed specifically to identify differences between databases to be contrasted.…”

Section: Related Work (mentioning)

confidence: 99%

“…A similar strategy is also used in STUCCO (Bay and Pazzani, 2001) to obtain characteristic itemsets in one database based on the χ² test. In addition, Magnum Opus (Webb et al, 2003) examines relations between itemsets and a database from several databases. On the other hand, this paper seeks paired itemsets whose correlations radically increase in one database.…”

Section: Related Work (mentioning)

confidence: 99%