Fine-grained knowledge of the spatiotemporal distribution of population is fundamental to a wide range of fields, including resource management, disaster response, public health, and urban planning. The United Nations’ Sustainable Development Goals also require accurate and timely assessment of where people live in order to formulate, implement, and monitor sustainable development policies. However, because appropriate auxiliary datasets and effective methodological frameworks are lacking, continuous multi-temporal gridded population data covering a long historical period are rarely available to aid our understanding of the spatiotemporal evolution of the population. In this study, we developed a population mapping framework that integrates a ResNet-N deep learning architecture, which accounts for neighborhood effects, with a vast number of Landsat-5 images from Google Earth Engine, to overcome both the data and methodology obstacles to rapid multi-temporal population mapping over a long historical period at a large scale. Applying this framework to China, we produced fine-scale multi-temporal gridded population data (1 km × 1 km) for 1985–2010 at 5-year intervals. The resulting data were validated against available census data and achieved comparable performance. By analyzing the multi-temporal population grids, we revealed the spatiotemporal evolution of population distribution in China from 1985 to 2010, characterized by the concentration of population in big cities and the contraction of small- and medium-sized cities. The proposed framework demonstrates the feasibility of mapping multi-temporal gridded population distribution at a large scale over a long period in a timely and low-cost manner, which is particularly useful in low-income and data-poor areas.
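The validation against census data mentioned above can be illustrated with a minimal sketch. It assumes that gridded estimates are summed within census units and then compared with the reported counts. The function names (`zonal_sum`, `validate`), the toy grids, and the choice of RMSE and mean absolute percentage error as metrics are all hypothetical; the abstract does not state which metrics or aggregation units were used.

```python
from collections import defaultdict
from math import sqrt


def zonal_sum(pop_grid, zone_grid):
    """Sum gridded population per census unit; cells labelled None are skipped."""
    totals = defaultdict(float)
    for pop_row, zone_row in zip(pop_grid, zone_grid):
        for pop, zone in zip(pop_row, zone_row):
            if zone is not None:
                totals[zone] += pop
    return dict(totals)


def validate(predicted, census):
    """Return (RMSE, mean absolute percentage error) over the census units.

    Assumes every census count is non-zero, since MAPE divides by it.
    """
    units = list(census)
    errors = [predicted.get(u, 0.0) - census[u] for u in units]
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    mape = 100.0 * sum(abs(e) / census[u] for e, u in zip(errors, units)) / len(errors)
    return rmse, mape


# Toy 3 x 3 grid of estimated persons per 1 km cell (made-up values).
pop_grid = [
    [10.0, 20.0, 0.0],
    [30.0, 40.0, 5.0],
    [0.0, 5.0, 10.0],
]
# Hypothetical census-unit label for each cell; None marks cells outside any unit.
zone_grid = [
    ["A", "A", "B"],
    ["A", "A", "B"],
    [None, "B", "B"],
]
# Made-up census counts for the two units.
census = {"A": 110.0, "B": 16.0}

predicted = zonal_sum(pop_grid, zone_grid)
rmse, mape = validate(predicted, census)
print(predicted, round(rmse, 2), round(mape, 2))
```

In practice the unit labels would come from rasterized administrative boundaries aligned to the 1 km grid, which this sketch does not cover.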