Spatio-temporal prediction is a key class of tasks in urban computing, e.g., traffic flow and air quality prediction. Adequate data is usually a prerequisite, especially when deep learning is adopted. However, cities differ in their level of development, and many cities still suffer from data scarcity. To address this problem, we propose a novel cross-city transfer learning method for deep spatio-temporal prediction tasks, called RegionTrans. RegionTrans aims to effectively transfer knowledge from a data-rich source city to a data-scarce target city. More specifically, we first learn an inter-city region matching function to match each target city region to a similar source city region. A neural network is then designed to effectively extract region-level representations for spatio-temporal prediction. Finally, an optimization algorithm is proposed to transfer learned features from the source city to the target city with the region matching function. Using citywide crowd flow prediction as a demonstration experiment, we verify the effectiveness of RegionTrans. Results show that RegionTrans can outperform state-of-the-art fine-tuning deep spatio-temporal prediction models, reducing prediction error by up to 10.7%.

* Equal contribution
Preprint. Work in progress.

In the literature, existing deep learning approaches are often designed to predict citywide phenomena as a whole [18,19], which makes region-level knowledge transfer hard to enable. To this end, rather than adopting the existing deep neural networks for citywide spatio-temporal prediction, we propose a novel deep transfer learning method. First, we design a region matching function that links each target city region to a similar source city region, based on the short period of available service data or, if applicable, on correlated auxiliary data. Then, in our proposed network structure, we first stack ConvLSTM layers [12] to capture the spatio-temporal patterns hidden in the service data.
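The region matching step described above can be illustrated with a minimal sketch. It rests on assumptions that the text does not fix: each city is divided into a grid, the service data of both cities covers the same short period, and similarity is measured by the Pearson correlation between the time series of two regions. The function name `match_regions` and the array layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def match_regions(target, source):
    """Match each target grid cell to its most correlated source grid cell.

    target: array of shape (T, Ht, Wt), service data of the target city.
    source: array of shape (T, Hs, Ws), service data of the source city
            over the same T time steps (an assumption of this sketch).
    Returns an integer array of shape (Ht, Wt, 2) holding the (row, col)
    of the matched source cell for every target cell.
    """
    T, Ht, Wt = target.shape
    _, Hs, Ws = source.shape

    # Flatten the grids so that each column is the time series of one region.
    tgt = target.reshape(T, -1).astype(float)
    src = source.reshape(T, -1).astype(float)

    # Standardize every series; eps guards against constant series.
    eps = 1e-8
    tgt = (tgt - tgt.mean(axis=0)) / (tgt.std(axis=0) + eps)
    src = (src - src.mean(axis=0)) / (src.std(axis=0) + eps)

    # Pearson correlation between every target and every source region.
    corr = tgt.T @ src / T  # shape (Ht*Wt, Hs*Ws)

    # For each target region, keep the source region with highest correlation.
    best = corr.argmax(axis=1)
    rows, cols = np.unravel_index(best, (Hs, Ws))
    return np.stack([rows, cols], axis=-1).reshape(Ht, Wt, 2)


# Synthetic example: every target series is a linear transform of one
# known source series, so the intended match is known by construction.
rng = np.random.default_rng(0)
source = rng.normal(size=(48, 3, 4))
target = np.empty((48, 2, 2))
target[:, 0, 0] = 2.0 * source[:, 1, 2] + 1.0
target[:, 0, 1] = source[:, 0, 0]
target[:, 1, 0] = 0.5 * source[:, 2, 3]
target[:, 1, 1] = source[:, 2, 0] - 3.0
matching = match_regions(target, source)
```

With real data the target period is short, so the correlations are noisy; when correlated auxiliary data is available, it could replace the service data as input to the same routine.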
After the ConvLSTM layers, to encode region representations, we add a Conv2D layer with a 1 × 1 filter. Because a 1 × 1 filter operates on each grid cell separately, this layer outputs one feature vector per region; it is the key component of our network that makes region-level transfer feasible. Finally, when learning the network parameters for the target city, we minimize the discrepancy between the region representations of similar inter-city regions, so as to enable region-level cross-city knowledge transfer. Using crowd flow prediction as a showcase [18,19], we verify the feasibility and effectiveness of RegionTrans.

Briefly, this paper makes the following contributions.

(i) To the best of our knowledge, this is the first work to study how to facilitate deep spatio-temporal prediction in a data-scarce target city by transferring knowledge from a data-rich source city.

(ii) We propose a novel deep transfer learning method, RegionTrans, for spatio-temporal prediction tasks by region-level cross-city transfer. RegionTrans first computes inter-city region similarities, and then stacks ConvLSTM and Conv2D (1 × 1 filter) layers to extract region-level representations reflecting spatio-temporal pattern...