Abstract. In spatio-temporal data analysis, dimension reduction is necessary to extract intrinsic structures and to avoid over-parametrization. The spatial dynamic factor model (SDFM) reduces the dimension of the data by decomposing them into spatial and temporal variations. The spatial variation is represented by a few spatially structured vectors, called factor loading vectors. The SDFM cannot be applied directly when the data contain missing values or when the observation sites vary with time. We extend the factor loading vector to a smooth continuous function obtained by basis expansion, call the extended model the spatially continuous dynamic factor model (SCDFM), and estimate it using the maximum L2 penalized likelihood method. We derive model selection criteria for choosing the regularization parameter and the number of factors. Applications to synthetic and real data show the effectiveness of our modeling strategy in terms of estimation accuracy and stability.
Introduction

In many scientific and industrial fields, spatio-temporal data, which depend on both time and space, are often observed. The amount of spatio-temporal data is increasing as measurement devices continue to be developed, which increases the importance of statistical modeling of such data [1]. In spatio-temporal data analysis, when the data at all observation sites are modeled directly by multivariate time series models such as the vector autoregressive moving average model [2], over-parametrization occurs and the resulting models are complex and difficult to interpret. To avoid over-parametrization, the space-time autoregressive moving average model [3], in which many parameters are fixed by topographical information, has been proposed. However, this model is not flexible enough to capture spatio-temporal structures.

For this reason, dimension reduction of spatio-temporal data is necessary to extract their inherent and essential structures. The spatial dynamic factor model (SDFM; [4,5,6]) is a dynamic factor model [7,8] extended to capture spatial structures. The SDFM reduces the dimension of spatio-temporal data by estimating the spatial and temporal variations of the data using a Bayesian approach. The spatial variation is represented by a few spatially correlated vectors, called factor loading (FL) vectors, which are assumed to follow a Gaussian process. The temporal variation is represented by autoregressive processes of the factors, called factor processes, which are the stochastic processes corresponding to the FL vectors. The estimated FL vectors and factor processes reveal the spatial, temporal and spatio-temporal structures of the data.
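
To make the decomposition described above concrete, the following is a minimal simulation sketch, not the estimation procedure of [4,5,6]. It assumes the generic dynamic factor form in which the observation at each time is the sum of FL vectors weighted by the factors plus independent Gaussian noise. The exponential covariance kernel, the one-dimensional site coordinates, the shared AR(1) coefficient, and all function and parameter names (`exponential_cov`, `cholesky`, `simulate_sdfm`, `ar_coef`, `noise_sd`) are illustrative assumptions that do not come from the source.

```python
import math
import random


def exponential_cov(sites, length_scale=1.0):
    """Covariance matrix of an exponential kernel over distinct 1-D sites.

    The kernel choice is an assumption made for illustration only.
    """
    return [[math.exp(-abs(a - b) / length_scale) for b in sites] for a in sites]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_sdfm(sites, n_factors=2, n_times=50, ar_coef=0.8, noise_sd=0.1, seed=0):
    """Simulate data from a toy spatial dynamic factor model.

    Returns (loadings, factors, data), where
      loadings[k][i] is FL vector k at site i (a Gaussian process draw),
      factors[t][k]  is factor k at time t (an AR(1) process),
      data[t][i]     is the observation at time t and site i.
    """
    rng = random.Random(seed)
    n_sites = len(sites)
    low = cholesky(exponential_cov(sites))

    # Each FL vector is one zero-mean Gaussian process draw: low @ z, z ~ N(0, I).
    loadings = []
    for _ in range(n_factors):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_sites)]
        loadings.append(
            [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n_sites)]
        )

    # Factor processes: independent AR(1) recursions started from zero.
    factors, data = [], []
    prev = [0.0] * n_factors
    for _ in range(n_times):
        prev = [ar_coef * f + rng.gauss(0.0, 1.0) for f in prev]
        factors.append(prev)
        # Observation: loadings weighted by the current factors, plus noise.
        data.append([
            sum(loadings[k][i] * prev[k] for k in range(n_factors))
            + rng.gauss(0.0, noise_sd)
            for i in range(n_sites)
        ])
    return loadings, factors, data
```

For example, `simulate_sdfm([0.5 * i for i in range(10)])` produces 50 time points at 10 sites driven by only 2 factors, which illustrates how a few FL vectors and factor processes can summarize data observed at many sites. The sketch requires the full set of sites to be fixed and fully observed at every time point, which is the setting in which the SDFM applies directly.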