A self-organizing map (SOM) is applied to handle missing daily rainfall data from stations with different rainfall patterns in Peninsular Malaysia. This study focuses on stations in Damansara and Kelantan and aims to evaluate the effectiveness of SOM in clustering and in imputing missing data. The values imputed by SOM are evaluated by computing the mean square error (MSE) and the correlation coefficient (R). In addition, the effects of the imputed data on the mean and variance of the rainfall data are observed. The clustering analysis showed that the stations in Damansara are grouped distinctly and have a good, even distribution of rain intensity compared with Kelantan. SOM was also found to be an excellent tool for estimating missing data.
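The two evaluation measures named in this abstract, MSE and the correlation coefficient R, can be illustrated with a minimal sketch. It assumes R is the Pearson correlation between held-out observed values and their imputed estimates; the abstract does not state the exact formulation, and the function names `mse` and `pearson_r` are hypothetical.

```python
import math


def mse(actual, estimated):
    """Mean square error between held-out values and their imputed estimates."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)


def pearson_r(actual, estimated):
    """Pearson correlation coefficient (assumed here to be the abstract's R)."""
    if len(actual) != len(estimated) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_e = sum((e - mean_e) ** 2 for e in estimated)
    if ss_a == 0 or ss_e == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(ss_a * ss_e)
```

A low MSE together with an R close to 1 would indicate that the imputed values track the held-out observations closely.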
This paper presents a study on the estimation of missing data. Data samples with different missingness mechanisms, namely Missing Completely At Random (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR), are simulated. The expectation maximization (EM) algorithm and mean imputation (MI) are applied to these data sets, and their performance is compared using the mean absolute error (MAE) and root mean square error (RMSE). The results showed that EM estimates the missing data with smaller errors than MI for all three missingness mechanisms. However, the graphical results showed that EM failed to estimate the missing values in the missing quadrants under MNAR.
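The workflow described in this abstract (simulate missingness, impute, score with MAE and RMSE) can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: it covers only the mean imputation baseline and the MCAR and MNAR mechanisms, omits MAR and the EM algorithm, and uses a simple threshold rule for MNAR. All function names are hypothetical.

```python
import math
import random


def mask_mcar(values, rate, rng):
    """MCAR: every value is dropped with the same probability `rate`."""
    return [None if rng.random() < rate else v for v in values]


def mask_mnar(values, threshold):
    """MNAR (assumed rule): a value is missing because it exceeds `threshold`."""
    return [None if v > threshold else v for v in values]


def mean_impute(masked):
    """Replace every missing entry with the mean of the observed entries."""
    observed = [v for v in masked if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in masked]


def mae_rmse(truth, masked, imputed):
    """MAE and RMSE computed only over the positions that were masked."""
    errors = [i - t for t, m, i in zip(truth, masked, imputed) if m is None]
    if not errors:
        raise ValueError("no missing positions to score")
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    truth = [rng.gauss(10.0, 2.0) for _ in range(200)]
    for name, masked in [
        ("MCAR", mask_mcar(truth, 0.2, rng)),
        ("MNAR", mask_mnar(truth, 12.0)),
    ]:
        mae, rmse = mae_rmse(truth, masked, mean_impute(masked))
        print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f}")
```

Under the MNAR rule used here, the observed mean is computed only from values below the threshold, so mean imputation systematically underestimates the missing entries. This illustrates why MNAR is the hardest of the three mechanisms to recover from observed data alone.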