“…Examining data imputation techniques more broadly, initial attempts simply replaced missing data with global statistics [19,20,21], whereas recent efforts explore probabilistic and machine learning methods that learn from the observed patterns in the incomplete data. Examples include k-nearest neighbor (KNN) methods [24,25], Support Vector Machine (SVM) applications [26], matrix completion and factorization [27,28,29], MissForest [30,31], Principal Component Analysis (PCA) [38,39,40], and Kriging-based [32,33,34] or Gaussian Process (GP) [4,5] methods. The latter family (Kriging and GP) is particularly attractive for spatio-temporal problems like the one considered here, though it might face a few important challenges: a) efficiently handling large datasets (many nodes and many time instances) requires some covariance approximation or simplification [35,36,37], which might reduce predictive accuracy; b) the approach assumes that surge is correlated between all nodes in close proximity to one another, which might not hold for all near-shore coastal regions, since complex local geomorphologies (for example, the existence of barriers or riverine systems) might change the storm inundation characteristics even for nodes in close geographic proximity; c) missing data in storm surge imputation are not randomly distributed in space and time but appear in a structured format, as will be shown later, with a substantial portion of nodes in the same geographical domain remaining dry over the same time period, which complicates calibration (proper selection of the length and temporal correlation scales).…”
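
To make the contrast between the earliest and the more recent imputation families concrete, the following is a minimal sketch in plain Python. It compares global-statistic (column mean) replacement with a simple KNN fill. The surge values, the function names `mean_impute` and `knn_impute`, and the choice of a root-mean-square distance over jointly observed entries are all illustrative assumptions; none of them comes from the cited works, and this is not the method developed in the source.

```python
import math

def mean_impute(rows):
    """Replace each missing entry (None) with the mean of its column's observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def knn_impute(rows, k=1):
    """Fill each missing entry with the average of that column over the k
    nearest rows. Distance uses only columns observed in both rows."""
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # no common observations: treat as infinitely far
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that observed column j.
            candidates = [(distance(row, other), other[j])
                          for m, other in enumerate(rows)
                          if m != i and other[j] is not None]
            if not candidates:
                continue  # nothing to borrow from; leave the gap
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical surge values in metres: rows are storms, columns are nodes,
# None marks a node that stayed dry (no surge recorded).
surge = [
    [1.0, 1.1, 3.0],
    [1.2, None, 3.1],
    [2.0, 2.1, 4.0],
    [2.2, 2.3, None],
]

by_mean = mean_impute(surge)
by_knn = knn_impute(surge, k=1)
```

In this toy case the mean fill gives every gap in a column the same value regardless of storm intensity, whereas the KNN fill borrows from the most similar storm. Neither variant addresses the structured, non-random missingness described in challenge c); the sketch only illustrates the two baseline ideas.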