2010
DOI: 10.1007/s12293-010-0048-1

Stratified prototype selection based on a steady-state memetic algorithm: a study of scalability

Abstract: Prototype selection (PS) is a suitable data reduction process for refining the training set of a data mining algorithm. Performing PS processes over existing datasets can sometimes be an inefficient task, especially as the size of the problem increases. However, in recent years some techniques have been developed to avoid the drawbacks that appeared due to the lack of scalability of the classical PS approaches. One of these techniques is known as stratification. In this study, we test the combination of strati…

Cited by 31 publications (19 citation statements)
References: 45 publications
“…Similar principles can be applied to other data mining methods and problems [50]. One of the problems with stratification is determining the optimum the size of the strata [38].…”
Section: Sampling and Data Partitioning (mentioning)
confidence: 95%
“…Of course, the drawback of this approach is the likely decrease in the performance of the algorithm. Derrac et al [38] used stratification to scale up a steady-state memetic algorithm for instance selection. Similar principles can be applied to other data mining methods and problems [50].…”
Section: Sampling and Data Partitioning (mentioning)
confidence: 99%
“…Despite the promising results shown by PR techniques with small and medium data sets, they lack of scalability to address big TR data sets (from tens of thousands of instances onwards [29]). The main problems found to deal with large-scale data are:…”
Section: Prototype Reduction and Big Data (mentioning)
confidence: 99%
“…Then, it joins each partial reduced set into a global solution. This approach has been used for instance selection [28,29] and generation [30] with promising results. However, two main problems appear when we increase the data set size:…”
Section: Introduction (mentioning)
confidence: 99%
“…Despite their bright performance with medium size problems, they lack of scalability with big data sets (from tens of thousands of instances [16]). Their main problems are:…”
Section: A Prototype Generation and Big Data (mentioning)
confidence: 99%
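
The abstract and the citation statements above describe stratification only in outline: the training set is split into disjoint strata, a selection method runs on each stratum independently, and the partial reduced sets are joined into a global solution. The following is a minimal sketch of that idea in Python, not the method of the cited paper. All function names are hypothetical, the class-preserving round-robin split is an assumption, and Hart's condensed nearest neighbour rule stands in for the steady-state memetic algorithm, which is not reproduced here.

```python
import random
from collections import defaultdict


def stratify(data, n_strata, seed=0):
    """Split (features, label) pairs into n_strata disjoint strata.

    Assumption: instances of each class are dealt round-robin so every
    stratum roughly preserves the class proportions of the full set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in data:
        by_class[label].append((features, label))
    strata = [[] for _ in range(n_strata)]
    for items in by_class.values():
        rng.shuffle(items)
        for i, item in enumerate(items):
            strata[i % n_strata].append(item)
    return strata


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def condensed_nn(stratum):
    """Stand-in selector: Hart's condensed nearest neighbour rule.

    Keeps an instance whenever the instances kept so far misclassify it
    with 1-NN, and repeats until a full pass adds nothing.
    """
    if not stratum:
        return []
    kept = [0]          # indices into stratum, in insertion order
    kept_set = {0}
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(stratum):
            if i in kept_set:
                continue
            nearest = min(kept, key=lambda k: sq_dist(stratum[k][0], x))
            if stratum[nearest][1] != y:
                kept.append(i)
                kept_set.add(i)
                changed = True
    return [stratum[k] for k in kept]


def stratified_selection(data, n_strata, select=condensed_nn, seed=0):
    """Run `select` on each stratum and join the partial reduced sets."""
    selected = []
    for stratum in stratify(data, n_strata, seed):
        selected.extend(select(stratum))
    return selected
```

Calling `stratified_selection(data, n_strata=4)` on a list of `(feature_tuple, label)` pairs returns the joined subset. Any other per-stratum selector can be passed through the `select` parameter, and the number of strata remains a free choice, which is the open question the first citation statement points to.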