2002
DOI: 10.1007/978-0-387-35602-0_18
An Information-Theoretic Approach to the Pre-Pruning of Classification Rules

Abstract: The automatic induction of classification rules from examples is an important technique used in data mining. One of the problems encountered is the overfitting of rules to training data. In some cases this can lead to an excessively large number of rules, many of which have very little predictive value for unseen data. This paper is concerned with the reduction of overfitting. It introduces a technique known as J-pruning, based on the J-measure, an information-theoretic means of quantifying the informa…
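For context, the J-measure referred to in the abstract is Smyth and Goodman's information-theoretic measure of the content of a rule of the form "if Y = y then X = x". A standard formulation is shown below; the truncated abstract does not confirm the exact form used in the paper, and p(x) is assumed to lie strictly between 0 and 1:

$$
J(X;\,Y{=}y) \;=\; p(y)\left[\, p(x \mid y)\,\log_2\frac{p(x \mid y)}{p(x)} \;+\; \bigl(1 - p(x \mid y)\bigr)\,\log_2\frac{1 - p(x \mid y)}{1 - p(x)} \,\right]
$$

Here p(y) is the probability that the rule's left-hand side is satisfied, p(x) is the prior probability of the predicted class, and p(x|y) is the corresponding posterior probability.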

Cited by 34 publications (77 citation statements)
References 5 publications
“…In this paper, we adopt PrismTCS, which is a computationally efficient member of the Prism family and also maintains a similar predictive accuracy compared with the original Prism classifier [5]. Good computational efficiency is needed, as ensemble learners generally do not scale well to large datasets.…”
Section: Random Prism (mentioning)
confidence: 99%
“…However, R-PrismTCS classifiers could also be parallelised. The Parallel Modular Classification Rule Induction (PMCRI) framework [19] for parallelising, amongst others, the PrismTCS [5] classifier, can be used for parallelising the R-PrismTCS classifier also. This is due to the similarity of the R-PrismTCS and PrismTCS classifiers.…”
Section: Empirical Scalability Study (mentioning)
confidence: 99%
“…Even though Prism has been shown to be less vulnerable to overfitting than decision trees, it is not immune. Hence pruning methods for Prism have been developed, such as J-pruning [6], in order to make Prism generalise better on the input data. Because of J-pruning's generalisation capabilities we use Prism with J-pruning for the induction of rules in eRules, hence it is briefly described here.…”
Section: Learning Modular Classification Rules (mentioning)
confidence: 99%
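To make the role of the J-measure in J-pruning concrete, the following minimal Python sketch shows the quantity being computed and the kind of truncation test J-pruning applies while a rule is specialised. The function names and the stopping test are illustrative assumptions, not the authors' implementation:

```python
import math

def j_measure(p_y: float, p_x_given_y: float, p_x: float) -> float:
    """Smyth & Goodman's J-measure for a rule 'if LHS then class = c'.

    p_y         -- probability that the rule's left-hand side is satisfied
    p_x_given_y -- probability of class c given that the left-hand side fires
    p_x         -- prior probability of class c (assumed to lie in (0, 1))
    """
    def term(p: float, q: float) -> float:
        # p * log2(p / q), with the convention that the term is 0 when p == 0
        return 0.0 if p == 0.0 else p * math.log2(p / q)

    cross_entropy = term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x)
    return p_y * cross_entropy

def should_truncate(j_current: float, j_with_next_term: float) -> bool:
    """Illustrative J-pruning test: stop specialising the rule as soon as
    adding the next candidate term would lower its J-value."""
    return j_with_next_term < j_current
```

A caller would recompute j_measure from the training counts before and after adding each candidate term and truncate the rule once should_truncate returns True, completing it in whatever way the induction algorithm prescribes.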
“…A possible candidate for that could be the PrismTCS classifier [6] which uses a default rule for unclassified instances.…”
Section: Ongoing and Future Work (mentioning)
confidence: 99%