Predicting the settlement of shield tunnels during the operation period has become increasingly important for formulating maintenance strategies. The sparse settlement data available during this period pose a formidable challenge for predictive Artificial Intelligence (AI) models, which may handle non-stationary relationships poorly or risk overfitting. In this study, we propose an improved machine learning (ML) model for sparse settlement data. We enhance the training data via time series clustering, use time series decomposition to uncover latent features, and employ Extreme Gradient Boosting (XGBoost) v1.5.1 with Bayesian Optimization (BO) v1.2.0 for precise predictions. Comparative experiments conducted at different acquisition points substantiate the model’s efficacy: the training set yields a Mean Absolute Error (MAE) of 0.649 mm, a Root Mean Square Error (RMSE) of 0.873 mm, a Mean Absolute Percentage Error (MAPE) of 3.566, and a Coefficient of Determination (R²) of 0.872, while the testing set yields an MAE of 0.717 mm, an RMSE of 1.048 mm, a MAPE of 4.080, and an R² of 0.846. The empirical results show that the proposed model outperforms simple ML models and a complex neural network model, with lower prediction error and higher accuracy across different sparse settlement datasets. Moreover, this paper underlines that accurate settlement predictions contribute to achieving some Sustainable Development Goals (SDGs). Specifically, preventive tunnel maintenance strategies based on predictive results can enhance tunnels’ long-term operational reliability, in accordance with SDG 9 (Industry, Innovation, and Infrastructure) and SDG 11 (Sustainable Cities and Communities).
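
For readers who want to reproduce the four reported error measures, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the authors' evaluation code. The function name `regression_metrics` and the sample values are hypothetical, and the sketch assumes that MAPE is expressed in percent, which the abstract does not state.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (assumed to be in percent), and R^2.

    Hypothetical helper using the standard definitions; the inputs are
    equal-length sequences of observed and predicted settlements (mm).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Illustrative values only; these are not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 32.0, 38.0]
metrics = regression_metrics(observed, predicted)
```

MAE and RMSE carry the unit of the settlement measurements (mm), whereas MAPE and R² are dimensionless, which is why only the first two are reported in millimetres.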