Intensive industrial development in special economic zones, such as Thailand’s Eastern Economic Corridor, increases energy consumption, leading to an imbalance in energy supply and a challenge for energy management. Local-level electricity consumption data are crucial for utility planners who manage and invest in the electrical grid. In this study, we propose a district-level electricity consumption estimation model that applies machine learning to publicly available statistical data together with built-up area (BU), lit area (AL), and sum of light intensity (SL) data extracted from Landsat 8 imagery and Suomi NPP nighttime light images. We compared models built with three machine learning algorithms: Multiple Linear Regression (MR), Decision Tree (DT), and Support Vector Regression (SVR). The results show that (1) electricity consumption is highly correlated with SL, AL, and BU; and (2) the DT model predicted local electricity consumption better than MR and SVR, with the lowest error rate and the highest R². Local governments in developing countries with limited data and financial resources can adopt the proposed approach, combining commonly available remote sensing and statistical data with simple machine learning models such as DT regression, for sustainable electricity management.
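The comparison criterion described above (lowest error, highest R²) can be illustrated with a minimal sketch. It assumes the error measure is RMSE, which the abstract does not specify. The model names `model_a`, `model_b`, `model_c` and all numbers are hypothetical toy values, not results from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rank_models(observed, predictions_by_model):
    """Return (name, rmse, r2) tuples sorted by ascending RMSE."""
    scored = [
        (name, rmse(observed, pred), r_squared(observed, pred))
        for name, pred in predictions_by_model.items()
    ]
    return sorted(scored, key=lambda row: row[1])


# Hypothetical district-level consumption values (arbitrary units).
observed = [10.0, 20.0, 30.0, 40.0]

# Hypothetical predictions from three unnamed candidate models.
predictions = {
    "model_a": [12.0, 18.0, 33.0, 37.0],
    "model_b": [10.0, 21.0, 29.0, 40.0],
    "model_c": [15.0, 15.0, 35.0, 35.0],
}

ranking = rank_models(observed, predictions)
for name, err, r2 in ranking:
    print(f"{name}: RMSE={err:.3f}, R2={r2:.3f}")
```

In practice the predictions would come from fitted regressors evaluated on held-out districts; the ranking step itself stays the same.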
Abstract. Socioeconomic data, such as household income, are an important indicator of people’s well-being. However, due to limited resources in many developing countries such as Thailand, the data obtained from household income surveys are often incomplete. As a result, the annual household survey usually contains gaps at the municipality household level. In this study, we aim to quantify household income at the sub-district level with K-NN imputation models, using satellite imagery and geospatial data as proxies for socioeconomic indicators. We examined the role of satellite and geospatial data in household income estimation, applied K-NN imputation methods to estimate the missing income data from various geographical and statistical variables, and quantified how these data improved the accuracy of sub-district household income estimation. Our results show a significant correlation between sub-district household income and geographical data extracted from day-night satellite data, such as night light intensity (r = 0.53), urban density (r = 0.44), residential area (r = 0.68), and urban area (r = 0.64), as well as statistical data such as household expenditure (r = 0.97). These variables can be used to improve the estimation of socioeconomic indicators, including household income, at the sub-district level. Income imputation from geographical data performs better than imputation from purely statistical variables. In particular, night light intensity can indicate the wealth of people living across large areas, while daytime satellite images can be interpreted for land use and land cover, which also imply socioeconomic status. Such socioeconomic proxies from space provide spatially explicit information for further study.
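As a rough illustration of K-NN imputation, the sketch below fills a missing income value with the mean income of the k most similar sub-districts. The abstract does not state the value of k, the distance metric, or the exact feature set, so Euclidean distance, k = 2, and the two-element feature tuples (standing in for normalized proxies such as night light intensity and urban area) are assumptions. The function `knn_impute` and all data are hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill missing 'income' with the mean income of the k nearest donor rows.

    Distance is Euclidean over 'features', which are assumed to be
    already scaled to comparable ranges.
    """
    donors = [r for r in rows if r["income"] is not None]
    if not donors:
        raise ValueError("at least one row with observed income is required")

    filled = []
    for r in rows:
        if r["income"] is not None:
            filled.append(dict(r))  # observed values are kept unchanged
            continue
        nearest = sorted(
            donors, key=lambda d: math.dist(r["features"], d["features"])
        )[:k]
        estimate = sum(d["income"] for d in nearest) / len(nearest)
        filled.append({**r, "income": estimate})
    return filled


# Hypothetical sub-districts: features = (night light, urban area), normalized.
rows = [
    {"name": "A", "features": (0.10, 0.20), "income": 100.0},
    {"name": "B", "features": (0.20, 0.10), "income": 120.0},
    {"name": "C", "features": (0.90, 0.80), "income": 300.0},
    {"name": "D", "features": (0.15, 0.15), "income": None},  # survey gap
]

filled = knn_impute(rows, k=2)
for r in filled:
    print(r["name"], r["income"])
```

Sub-district D is closest to A and B in feature space, so its imputed income is their average. Real use would require choosing k by validation and normalizing each feature before computing distances.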