2022
DOI: 10.1186/s40537-022-00617-z
Improved cost-sensitive representation of data for solving the imbalanced big data classification problem

Abstract: Dimension reduction is a preprocessing step in machine learning for eliminating undesirable features and increasing learning accuracy. To reduce redundant features, several data representation methods exist, each with its own advantages. On the other hand, big data with imbalanced classes is one of the most important issues in pattern recognition and machine learning. In this paper, a method is proposed in the form of a cost-sensitive optimization problem which implements the process of select…

Cited by 12 publications (2 citation statements)
References 39 publications
“…Ramos et al. proposed the knowledge of seasonal climate prediction and the environment of centralized data storage [9]. Fattahi et al. considered dimensionality reduction as a preprocessing step in machine learning to remove unwanted features and improve learning accuracy [10]. However, the existing research studies only the accuracy and efficiency of digital twins; few studies combine accounting information with digital twins.…”
Section: Related Work
confidence: 99%
“…The known Mahalanobis distance, Chebyshev distance, Minkowski distance, etc. can be used to calculate the distance between any two samples. The Minkowski distance and Mahalanobis distance are as shown in formulas (10) and (11). The Minkowski distance is:…”
Section: Missing Data Filling Algorithm of K-means Clustering
confidence: 99%
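The snippet above names Minkowski and Mahalanobis distances as candidate metrics for the k-means-based missing-data filling algorithm, but the quoted formulas (10) and (11) are cut off. As a minimal sketch of the two metrics in NumPy (the sample array `X` and both function names are invented for illustration, not taken from the cited paper):

```python
import numpy as np

def minkowski_distance(x, y, p=2):
    """Minkowski distance of order p between two sample vectors.

    p=2 gives the Euclidean distance; p=1 gives the Manhattan distance.
    """
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def mahalanobis_distance(x, y, cov):
    """Mahalanobis distance between two samples, given the sample
    covariance matrix of the data set (accounts for feature scale
    and correlation, unlike the Minkowski family)."""
    diff = x - y
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Toy data set: 4 samples, 2 features (illustrative values only).
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 5.0],
              [4.0, 8.0]])
cov = np.cov(X, rowvar=False)  # 2x2 feature covariance matrix

d_mink = minkowski_distance(X[0], X[1], p=2)  # Euclidean: sqrt(5)
d_maha = mahalanobis_distance(X[0], X[1], cov)
```

In a k-means filling scheme of the kind the quote describes, either metric would replace the plain Euclidean distance when assigning a sample with missing values to its nearest cluster center over the observed features.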