Spatio-temporal forecasting is of great importance in a wide range of dynamical systems applications, such as earth science and transport planning. These applications rely on accurate predictions of spatio-temporally structured data that reflect real-world phenomena. A salient characteristic of such systems is that they are driven not only by physical laws but also by localized factors in particular spatial and temporal regions. One major challenge is to infer the underlying causes that generate the observed data stream and to propagate the involved causal dynamics through the distributed observing units. Another challenge is that machine learning based predictive models require massive amounts of annotated data for training. However, acquiring high-quality annotated data is inherently manual and tedious because it needs a considerable amount of human intervention, which makes it infeasible in fields that require high levels of expertise. To tackle these challenges, we propose a spatio-temporal physics-coupled neural network (ST-PCNN) model that learns the underlying physics of the dynamical system and then couples the learned physics to assist the learning of the recurring dynamics. To deal with data-acquisition constraints, we propose an active learning mechanism based on Kriging that actively acquires the most informative data for ST-PCNN training in a partially observable environment. Our experiments on both synthetic and real-world datasets show that the proposed ST-PCNN with active learning converges to near-optimal accuracy with substantially fewer instances.