2011
DOI: 10.3844/jcssp.2011.216.224

Predicting Missing Attribute Values Using k-Means Clustering

Abstract: Problem statement: Predicting the value for missing attributes is an important data preprocessing problem in data mining and knowledge discovery tasks. Several methods have been proposed to treat missing data and the one used more frequently is deleting instances containing at least one missing value of a feature. When the dataset has minimum number of missing attribute values then we can neglect the instances. But if it is high, deleting those instances may neglect the essential information. Some methods, suc…

Cited by 20 publications (13 citation statements)
References 21 publications
“…Table 1 There are a lot of approaches to deal with missing data. In some cases, deletion or elimination the missing variable is the default method for most procedures (Suguna and Thanuskodi, 2011). However in time series regression, this approach seems like not the best methods to be used.…”
Section: Treatment Of Missing Data
Citation type: mentioning
confidence: 99%
“…It has been reported that the results using Radial Basis Function (RBF) neural network is better than Back Propagation Neural Network (BPNN) in modeling a meteorological problems such as weather forecasting. In addition, there was analysis shows that pre-processing data analysis also can influenced the performance of prediction model (Zhang, 2002;Suguna and Thanuskodi, 2011). From previous work, the method of BPNN is better compared to SARIMA in obtaining water level prediction at Dungun River, Terengganu (Arbain and Wibowo, 2012).…”
Section: Introduction
Citation type: mentioning
confidence: 98%
“…It will often be desirable to choose a subset of all the features available, to reduce the dimensionality of the problem space. This step often requires a good deal of domain knowledge and data analysis (Suguna et al, 2011, Rao, 2003.…”
Section: Clustering Steps
Citation type: mentioning
confidence: 99%
“…Another category of methods to handle missing data consists of machine learning techniques such as multi-layer perceptrons (MLP), self-organizing maps (SOM) k-nearest neighbour (kNN) [15,31], decision trees [27] and Linear Discriminant Analysis (LDA) [1]. Moreover, there are various techniques, which are used to handle missing data in software cost estimation models such as mean imputation, list-wise deletion (LD), kNN, and Expectation Maximization (EM) algorithms.…”
Section: Missing Data Problem In Prediction Models
Citation type: mentioning
confidence: 99%