bioRxiv preprint first posted online Aug. 17, 2017; doi: http://dx.doi.org/10.1101/171967. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. All rights reserved. No reuse allowed without permission.

Introduction
Missing values are widespread in mass spectrometry (MS)-based metabolomics data. Various methods have been applied to handle them, but the choice of method can significantly affect subsequent data analyses and interpretations. Missing values are conventionally classified into three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR).

Objectives
The aim of this study was to comprehensively compare common imputation methods for the different types of missing values, using two separate metabolomics data sets (977 and 198 serum samples, respectively), and to propose a strategy for dealing with missing values in metabolomics studies.

Methods
The imputation methods compared were zero, half minimum (HM), mean, median, random forest (RF), singular value decomposition (SVD), k-nearest neighbors (kNN), and quantile regression imputation of left-censored data (QRILC). Normalized root mean squared error (NRMSE) and the NRMSE-based sum of ranks (SOR) were applied to evaluate imputation accuracy for MCAR/MAR and MNAR, respectively. Principal component analysis (PCA)- and partial least squares (PLS)-based Procrustes sums of squared errors were used to evaluate the overall sample distribution. Student's t-test followed by Pearson correlation analysis was conducted to evaluate the effect of imputation on univariate statistical analysis.

Results
Our findings demonstrated that RF imputation performed best for MCAR/MAR and that QRILC was the preferred method for MNAR.

Conclusion
Combining these findings with the "modified 80% rule", we proposed a comprehensive strategy and developed a publicly accessible web tool for missing value imputation in metabolomics data.
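
The NRMSE-based evaluation described in the Methods can be illustrated with a minimal sketch. It simulates MCAR missingness on synthetic data, applies two of the simple imputation methods named above (half minimum and mean), and scores each against the known true values. The abstract does not state the NRMSE formula, so the definition used here (root of the mean squared error over the masked entries, divided by the variance of the true masked values) is an assumption based on one common convention. The function names, the log-normal synthetic data, and the 10% missing rate are likewise hypothetical and are not taken from the study.

```python
import numpy as np


def half_min_impute(x):
    """Replace NaNs in each column with half of that column's observed minimum."""
    filled = x.copy()
    half_min = np.nanmin(x, axis=0) / 2.0
    rows, cols = np.where(np.isnan(x))
    filled[rows, cols] = half_min[cols]
    return filled


def mean_impute(x):
    """Replace NaNs in each column with that column's observed mean."""
    filled = x.copy()
    col_mean = np.nanmean(x, axis=0)
    rows, cols = np.where(np.isnan(x))
    filled[rows, cols] = col_mean[cols]
    return filled


def nrmse(x_true, x_imputed, missing_mask):
    """Assumed definition: sqrt(MSE on masked entries / variance of true masked entries)."""
    diff = x_true[missing_mask] - x_imputed[missing_mask]
    return float(np.sqrt(np.mean(diff ** 2) / np.var(x_true[missing_mask])))


def demo(seed=0, n_samples=200, n_features=20, missing_rate=0.1):
    """Simulate MCAR on synthetic data and score both imputation methods."""
    rng = np.random.default_rng(seed)
    # Synthetic, strictly positive intensities (hypothetical, not the study data).
    x_true = rng.lognormal(mean=5.0, sigma=1.0, size=(n_samples, n_features))
    # MCAR: every entry has the same probability of being removed.
    mask = rng.random(x_true.shape) < missing_rate
    x_missing = x_true.copy()
    x_missing[mask] = np.nan
    return {
        "half_min": nrmse(x_true, half_min_impute(x_missing), mask),
        "mean": nrmse(x_true, mean_impute(x_missing), mask),
    }


if __name__ == "__main__":
    for method, score in demo().items():
        print(f"{method}: NRMSE = {score:.3f}")
```

A lower NRMSE indicates imputed values closer to the true ones. The same scoring function could be applied to any other imputation method, provided the true values behind the masked entries are known.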