Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation 2015
DOI: 10.1145/2739480.2754665

Multiple Imputation for Missing Data Using Genetic Programming

Abstract: Missing values are a common problem in many real-world databases. Inadequate handling of missing data can lead to serious problems in data analysis. A common way to cope with this problem is to use imputation methods to fill missing values with plausible values. This paper proposes GPMI, a multiple imputation method that uses genetic programming as a regression method to estimate missing values. Experiments on eight datasets with six levels of missing values compare GPMI with seven other popular and advanced im…

Citations: cited by 28 publications (20 citation statements)
References: 29 publications
“…In [2,4] several algorithms of supervised classification are considered, such as Hotdeck, KNN, and Decision Trees. These methods require a labeled training set of data and they use similarity metrics to define the relations among the elements to classify.…”
Section: Related Work (mentioning)
confidence: 99%
“…Under this context, this paper analyses the problem of text classification, where an algorithm classifies the observed text values within different categories. Existing literature proposes different machine learning and data mining algorithms to solve classification problems in a supervised way [2,3,4]. Algorithms such as K-Nearest-Neighbors (KNN), where text distance metrics can be used to classify elements, are relevant for documents classification problems [3,5].…”
Section: Introduction (mentioning)
confidence: 99%
“…GPMI is a Genetic Programming (GP) algorithm for multiple imputations that was proposed by Tran et. al [11]. GPMI uses GP as a non-parametric regression method to build mathematical functions that regress missing values of one feature on other features under the control of prediction capability and classification accuracy during the evaluation process.…”
Section: Recent Related Work (mentioning)
confidence: 99%
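
The regression-based multiple imputation idea described in the statement above can be sketched roughly as follows. This is not GPMI itself: ordinary least squares on bootstrap resamples stands in for the evolved genetic programming expression, and the names `fit_line`, `multiple_impute`, and `pooled_estimates`, the two-feature row layout, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (an assumed stand-in for a GP-evolved function)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x equal): fall back to predicting the mean of y.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def multiple_impute(rows, m=5, seed=0):
    """Return m completed copies of rows; each row is (x, y) and y may be None."""
    rng = random.Random(seed)
    complete = [(x, y) for x, y in rows if y is not None]
    completed_sets = []
    for _ in range(m):
        # A bootstrap resample of the complete rows gives each imputation its own regressor.
        sample = [rng.choice(complete) for _ in complete]
        a, b = fit_line([x for x, _ in sample], [y for _, y in sample])
        completed_sets.append(
            [(x, y if y is not None else a + b * x) for x, y in rows]
        )
    return completed_sets


def pooled_estimates(completed_sets):
    """Average the m values of y for each row across the completed copies."""
    n = len(completed_sets[0])
    return [mean(cs[i][1] for cs in completed_sets) for i in range(n)]


# Hypothetical toy data: the second feature is missing in two rows.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, None), (4.0, 8.0), (5.0, None)]
sets = multiple_impute(rows, m=5, seed=0)
pooled = pooled_estimates(sets)
```

Each of the `m` completed copies keeps the observed values unchanged and fills the gaps with a prediction from a differently resampled regressor; averaging across copies is one simple way to pool them. Swapping `fit_line` for a symbolic regressor would bring the sketch closer to the approach the statement describes.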
“…Table 3 shows the atoms factors of the used performance metrics: True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN), Positive (P), and Negative (N) instances [24]. using TP, TN, FP, and FN as given in Equations (9), (10), (11) and (12) respectively [25].…”
Section: B Evaluation Metrics (mentioning)
confidence: 99%
“…Results show that the multiple imputation outperforms single imputation such as mean/mode, regression and hot deck imputation methods. [159] proposes GPMI, a multiple imputation method that uses genetic programming as a regression method to estimate missing values. Experiments on eight datasets with six levels of missing values compare GPMI with seven other popular and advanced imputation methods on two measures: the prediction accuracy and the classification accuracy.…”
Section: Imputation For Classification With Incomplete Data (mentioning)
confidence: 99%