Abstract. Severe airborne particulate matter (PM, including PM2.5 and PM10) pollution in India has caused widespread concern. Accurate PM datasets are fundamental for science-based policymaking and health impact assessment, but surface observations in India are limited by the small number of monitoring sites and their uneven distribution. In this work, a simply structured, efficient, and robust model based on the Light Gradient Boosting Machine (LightGBM) was developed to fuse multi-source data and estimate long-term (1980–2022) historical daily ground-level PM datasets for India (LongPMInd). The LightGBM model shows good accuracy, with out-of-sample, out-of-site, and out-of-year cross-validation (CV) test R2 of 0.77, 0.70, and 0.66, respectively. Small performance gaps between PM2.5 training and testing (delta RMSE of 1.06, 3.83, and 7.74 μg m-3) indicate a low risk of overfitting. Given this generalization ability, openly accessible, long-term, high-quality daily PM2.5 and PM10 products were then reconstructed (10 km, 1980–2022). The results show that India has experienced severe PM pollution in the Indo-Gangetic Plain (IGP), especially in winter. PM concentrations increased significantly (p<0.05) in most regions since 2000 (0.34 μg m-3 year-1). A turning point occurred in 2018, when the Indian government launched the National Clean Air Program: PM2.5 concentrations declined in most regions (-0.78 μg m-3 year-1) during 2018–2022. Severe PM2.5 pollution caused a continuous increase in attributable premature mortality, from 0.73 (95 % CI: 0.65–0.80) million in 2000 to 1.22 (95 % CI: 1.03–1.41) million in 2019, particularly in the IGP, where attributable mortality increased from 0.36 to 0.60 million. The LongPMInd datasets have the potential to support multiple applications in air quality management, public health, and climate change. The daily and monthly PM2.5 and PM10 datasets are publicly accessible at https://doi.org/10.5281/zenodo.10073944 (Wang et al., 2023a).
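The three validation schemes named in the abstract differ only in what is held out together: individual records (out-of-sample), all records from a monitoring site (out-of-site), or all records from a calendar year (out-of-year). The following is a minimal Python sketch of that idea, not the authors' implementation. The helper names `group_folds` and `cv_splits`, the record fields `site` and `year`, and the fold count are assumptions made for illustration.

```python
import random


def group_folds(keys, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs such that all records sharing
    a key land in the same fold, so no test key appears in training."""
    unique = sorted(set(keys))
    random.Random(seed).shuffle(unique)
    # Assign each distinct key to a fold in round-robin order.
    fold_of = {k: i % n_folds for i, k in enumerate(unique)}
    for f in range(n_folds):
        test = [i for i, k in enumerate(keys) if fold_of[k] == f]
        train = [i for i, k in enumerate(keys) if fold_of[k] != f]
        if test:  # skip empty folds when there are fewer keys than folds
            yield train, test


def cv_splits(records, scheme, n_folds=5, seed=0):
    """Build CV splits for a list of dicts with hypothetical
    'site' and 'year' fields (assumed names, not from the paper)."""
    if scheme == "out-of-sample":
        keys = list(range(len(records)))  # each record is its own group
    elif scheme == "out-of-site":
        keys = [r["site"] for r in records]
    elif scheme == "out-of-year":
        keys = [r["year"] for r in records]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return list(group_folds(keys, n_folds, seed))


# Toy data: 5 hypothetical sites observed in each of 5 years.
records = [{"site": s, "year": y} for s in "ABCDE" for y in range(2018, 2023)]
site_splits = cv_splits(records, "out-of-site")
year_splits = cv_splits(records, "out-of-year")
```

Under this grouping, an out-of-site test fold contains only sites the model never saw during training, which probes spatial generalization. An out-of-year fold does the same for time, which matters when a model is used to reconstruct years that have no observations.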