2023
DOI: 10.1186/s40537-022-00679-z

Smoothing target encoding and class center-based firefly algorithm for handling missing values in categorical variable

Abstract: One of the most common causes of incompleteness is missing data, which occurs when no data value for the variables in observation is stored. An adaptive approach model outperforming other numerical methods in the classification problem was developed using the class center-based Firefly algorithm by incorporating attribute correlations into the imputation process (C3FA). However, this model has not been tested on categorical data, which is essential in the preprocessing stage. Encoding is used to convert text o…

Cited by 4 publications (1 citation statement)
References 56 publications

“…The largest cluster (#0) has 41 members and a silhouette value of 0.78. It is labeled as missing value by both log-likelihood ratios (LLR) and Latent Semantic Indexing (LSI), and as analysis ( Some study opportunities that received less attention comprised missing data and the class center method, as initially explored by Nugroho et al [169]-[172], building upon the work initiated by Tsai et al [173]. More recently, the GAN method for data imputation was developed and studied [133], [137], [147], [148], [174]-[200].…”
Section: Figure 17 Co-occurrence Network
Citation type: mentioning
confidence: 99%