2021
DOI: 10.1109/access.2021.3081366

Multiple Filter-Based Rankers to Guide Hybrid Grasshopper Optimization Algorithm and Simulated Annealing for Feature Selection With High Dimensional Multi-Class Imbalanced Datasets

Abstract: DNA microarray data analysis is notoriously difficult because of the massive number of features, imbalanced class distribution, and limited available samples. In this paper, we focus on high-dimensional multi-class imbalanced problems. Such problems pose acute challenges for conventional classifiers, which struggle to classify both the minority and majority classes effectively. Numerous efforts have been devoted to addressing either high dimensionality dataset or class im…

Cited by 14 publications (3 citation statements)
References 103 publications

“…Common fraud detection datasets share certain characteristics essential for the effectiveness of machine learning models. Imbalanced class distribution, where the majority of instances are nonfraudulent, mirrors the real-world scenario but necessitates careful handling to prevent model bias [70]. Temporal aspects, such as time-based patterns in transaction data, are often present to capture the dynamic nature of fraudulent activities.…”
Section: Common Fraud Detection Datasets
mentioning
confidence: 99%
“…Using statistical models for feature selection has several advantages. These models can help identify the most essential features in a dataset, reduce the dimensionality of the data, and improve the performance of machine learning algorithms [ 13 15 ]. However, wrapper models for feature selection can have drawbacks, such as randomness and unstable results.…”
Section: Introduction
mentioning
confidence: 99%
“…Shahee et al. (Shahee & Ananthakumar, 2020) used the effective complex distance measure to properly select the iteratively modified features, so as to obtain the final feature ranking, and the unbalanced features between and within the classes are combined. Sharifai et al. (Sharifai & Zainol, 2021) combined the feature rankings of six filters to select the feature set that exceeds the set threshold, then searched the feature space and mined high-quality features to enhance the capability to predict minority classes. Kim et al. (Kim, Kang, & Sohn, 2021) applied the ensemble learning paradigm to the feature evaluation process through a feature evaluation scheme based on the filtering method, which accurately recognized the features with good robustness and greatly reduced the calculation time.…”
mentioning
confidence: 99%