2006
DOI: 10.1007/s10994-006-5834-0
Propositionalization-based relational subgroup discovery with RSD

Abstract: Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame and mutagenicity prediction) an…

Cited by 87 publications (65 citation statements)
References 29 publications (54 reference statements)
“…The clusters of the best clusterings according to clustering quality and expert evaluation are further interpreted by the corresponding non-image data. For the interpretation of image clusters, we employ relational subgroup discovery (RSD) [10,17], an algorithm for finding interesting subgroups in data. In subgroup discovery, the goal is to find subgroup descriptions (typically conjunctions of attribute values as in rule learning) for which the distribution of examples with respect to a specified target variable is "unusual" compared to the overall target distribution.…”
Section: Methods
confidence: 99%
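The "unusual" distribution mentioned in this citation statement is typically quantified with a rule quality measure such as weighted relative accuracy (WRAcc), which trades off subgroup size against how much the subgroup's target distribution deviates from the overall one. A minimal sketch (the counts used in the toy example are illustrative, not taken from the paper):

```python
def wracc(n_subgroup, n_subgroup_pos, n_total, n_total_pos):
    """Weighted relative accuracy of a candidate subgroup.

    WRAcc = p(subgroup) * (p(target | subgroup) - p(target)):
    the coverage of the subgroup times the lift in target frequency
    inside the subgroup relative to the whole population.
    """
    if n_subgroup == 0:
        return 0.0
    coverage = n_subgroup / n_total
    return coverage * (n_subgroup_pos / n_subgroup - n_total_pos / n_total)

# Toy population: 100 examples, 40 of them positive.
# A candidate subgroup covers 20 examples, 16 of them positive.
print(round(wracc(20, 16, 100, 40), 3))
```

A score of zero means the subgroup's target distribution matches the population; larger positive values indicate larger, more distinctive subgroups.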
“…In the second step, we explained the clusters by relating them to clinical and other non-image variables. To do so, we employed RSD [10,17], an algorithm for relational subgroup discovery, with the cluster membership of patients as the target variable. After extracting relevant information from 200 GB of data (removing duplicates, intermediate results, and incompletely processed images), we obtained a dataset comprising 10 GB of image data from 454 PETs, and 42 variables from clinical and demographical data organized in 11 relations of a relational database.…”
Section: Introduction
confidence: 99%
“…In order to evaluate how BCP performs, it will be compared with RSD (Železný and Lavrač, 2006), a well-known propositionalization algorithm whose implementation is available at http://labe.felk.cvut.cz/~zelezny/rsd. RSD is a system that tackles the Relational Subgroup Discovery problem: given a population of individuals and a property of interest, RSD seeks to find population subgroups that are as large as possible and have the most unusual distribution characteristics.…”
Section: Propositionalization
confidence: 99%
“…We have compared CILP++ with Aleph (Srinivasan, 2007), a state-of-the-art ILP system based on Progol, and compared BCP with a well-known propositionalization method, RSD (Železný and Lavrač, 2006), using neural networks and the C4.5 decision tree learner (Quinlan, 1993), on a number of benchmarks: four Alzheimer's datasets (King and Srinivasan, 1995) and the Mutagenesis (Srinivasan and Muggleton, 1994), KRK (Bain and Muggleton, 1994) and UW-CSE (Richardson and Domingos, 2006) datasets. Several aspects were empirically evaluated: standard classification accuracy using cross-validation and runtime measurements, how BCP performs in comparison with RSD, and how CILP++ performs in different settings using feature selection (Guyon and Elisseeff, 2003).…”
Section: Introduction
confidence: 99%
“…Following a significant body of work in ILP [1], in our work a feature corresponds to a clause, and the feature holds for a sequence if the clause covers that sequence. We followed the approach described in [2] to map each sequence to a set of features.…”
Section: Clustering Protein Sequences
confidence: 99%
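The clause-to-feature mapping described in this last citation statement is the core of propositionalization: each first-order feature becomes a boolean column, and each example's row records which features cover it. A hypothetical sketch, using hand-coded Python predicates in place of evaluated first-order clauses and the East-West trains domain from the abstract as toy data:

```python
# Hypothetical sketch: each "feature" is a predicate over an example,
# standing in for a first-order clause; propositionalization evaluates
# every feature on every example, producing a boolean attribute table
# that a propositional rule learner can then process.

features = {
    "has_long_car": lambda cars: any(c["length"] == "long" for c in cars),
    "has_closed_car": lambda cars: any(c["roof"] == "closed" for c in cars),
}

# Toy relational examples: each train is a list of car records.
trains = {
    "east1": [{"length": "long", "roof": "closed"}],
    "west1": [{"length": "short", "roof": "open"}],
}

def propositionalize(examples, features):
    """Return a {example: {feature_name: bool}} coverage table."""
    return {
        name: {fname: bool(test(ex)) for fname, test in features.items()}
        for name, ex in examples.items()
    }

table = propositionalize(trains, features)
print(table["east1"])
```

Subgroup descriptions are then conjunctions over these boolean columns, which is what allows an adapted propositional rule learner to operate on originally relational data.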