2021
DOI: 10.3390/app11177825

Classification of Imbalanced Data Represented as Binary Features

Abstract: Typically, classification is conducted on a dataset that consists of numerical features and target classes. For instance, a grayscale image, which is usually represented as a matrix of integers varying from 0 to 255, enables one to apply various classification algorithms to image classification tasks. However, datasets represented as binary features cannot use many standard machine learning algorithms optimally, yet their amount is not negligible. On the other hand, oversampling algorithms such as synthetic mi…

Cited by 8 publications (6 citation statements)
References 38 publications
“…Accuracy provides information about the extent to which the model predicts classes correctly. Balanced accuracy becomes relevant if the dataset has a class imbalance, whereas the F1 Score delivers a balance between precision and recall, especially in the case of binary classification (Mahmudah et al., 2021).…”
Section: Figure 4 Data Standardization
Mentioning (confidence: 99%)
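The point made in that statement can be seen with a small, entirely hypothetical example: an always-majority classifier on an imbalanced binary dataset looks strong under plain accuracy but collapses under balanced accuracy and F1 (the class sizes below are made up for illustration).

```python
# Hypothetical imbalanced dataset: 95 negatives, 5 positives,
# and a degenerate classifier that always predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)                   # 0.95: looks good
sensitivity = tp / (tp + fn)                         # 0.0: misses every positive
specificity = tn / (tn + fp)                         # 1.0
balanced_accuracy = (sensitivity + specificity) / 2  # 0.5: no better than chance
denom = 2 * tp + fp + fn
f1 = 2 * tp / denom if denom else 0.0                # 0.0
```

Here balanced accuracy and F1 expose the failure that raw accuracy hides, which is why they are the preferred metrics when one class dominates.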
“…When dealing with asymmetric data classification, different evaluation metrics were essential. While accuracy is a common metric for classification [6062], it is not suitable for asymmetric classification [63] because a less effective model can achieve a higher accuracy. Evaluation of expected and predicted class labels or assessment of probabilities for the anticipated class labels were needed for classification issues.…”
Section: 1 Performance Evaluation
Mentioning (confidence: 99%)
“…Additionally, the AUC analysis was employed to visualize the categorization of the dataset used for label prediction. Equations 1 to 5 provide the mathematical expressions for calculating specificity, sensitivity, BRA, G-mean, and AUC, as follows: Specificity: The percentage of accurately detected negatives over all possible negative forecasts produced by the algorithm is measured by specificity [63]. It is also referred to as the true negative rate.…”
Section: 11 Metrics for Label Prediction
Mentioning (confidence: 99%)
“…Table 1 presents the summary of prune rules derived for the present scenario. The list of pruning rules is applied over the feature vector to reduce the difference between positive and negative pairs before it is given to classifiers (11). During experimentation, the number of positive instances increased from 5.32% to 43.95%, whereas the number of negative instances decreased from 94.68% to 56.05%.…”
Section: Rule 8: Pruning by Distance (Pronoun Type Analysis)
Mentioning (confidence: 99%)
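A minimal sketch of that rule-based pruning idea, with entirely hypothetical data and a made-up distance rule (the actual rules are defined in the cited paper's Table 1): negatives matching any prune rule are discarded, which raises the share of positive pairs before classification.

```python
# Hypothetical sketch: drop negative pairs that any pruning rule flags,
# shrinking the majority class before the data reaches a classifier.
def prune(instances, rules):
    return [x for x in instances
            if x["label"] == 1 or not any(rule(x) for rule in rules)]

# Toy feature vectors: 1 positive pair, 3 negative pairs.
pairs = [
    {"label": 1, "distance": 1},
    {"label": 0, "distance": 9},
    {"label": 0, "distance": 2},
    {"label": 0, "distance": 8},
]
# Made-up rule in the spirit of "pruning by distance": discard far-apart pairs.
rules = [lambda x: x["distance"] > 5]

kept = prune(pairs, rules)
positive_share = sum(x["label"] for x in kept) / len(kept)  # rises 0.25 -> 0.5
```

Unlike oversampling, this rebalances by removing majority-class instances that the rules deem uninformative, mirroring the positive-share increase reported in the quoted passage.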