This study compares the performance of six machine learning (ML) models, namely Random Forest (RF), Adaptive Boosting (ADA), CatBoost (CAT), Support Vector Machine (SVM), Lasso Regression (LAS), and Artificial Neural Network (ANN), with that of traditional empirical formulas for estimating the maximum scour depth downstream of sluice gates. The ML models generally outperform the empirical formulas, with correlation coefficients (CORR) ranging from 0.882 to 0.944 for the ML models compared with 0.835–0.847 for the empirical methods. ANN achieved the highest performance, followed closely by CAT (CORR = 0.936), while RF, ADA, and SVM were competitive, with CORR values around 0.928. Variable importance assessments identified the dimensionless densimetric Froude number (Fd) as highly influential, particularly in the RF, CAT, and LAS models. Furthermore, SHapley Additive exPlanations (SHAP) value analysis provided insight into the impact of each predictor on the model outputs. Uncertainty assessment using Monte Carlo (MC) and Bootstrap (BS) methods, each with 1,000 iterations, indicated that the ML models can produce reliable uncertainty maps. In this assessment, ANN again performed best, with higher mean values and lower standard deviations, followed by CAT. MC results tend to be more optimistic than BS results, as reflected in the median values and interquartile ranges. Overall, the analysis underscores the efficacy of ML models in providing precise and reliable scour depth predictions.
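
As an illustration of the Bootstrap (BS) uncertainty assessment described above, the following is a minimal sketch and not the study's actual implementation. It assumes that CORR is the Pearson correlation between observed and predicted scour depths and that each of the 1,000 iterations resamples the (observed, predicted) pairs with replacement. The function name `bootstrap_corr` and the synthetic data are hypothetical placeholders, not values from the study.

```python
import numpy as np

def bootstrap_corr(y_true, y_pred, n_iter=1000, seed=0):
    """Resample (observed, predicted) pairs with replacement and
    return the Pearson correlation computed on each resample."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rng = np.random.default_rng(seed)
    n = y_true.size
    corrs = np.empty(n_iter)
    for i in range(n_iter):
        idx = rng.integers(0, n, size=n)  # indices drawn with replacement
        corrs[i] = np.corrcoef(y_true[idx], y_pred[idx])[0, 1]
    return corrs

# Synthetic placeholder data (assumption: NOT the study's dataset).
rng = np.random.default_rng(42)
observed = rng.uniform(0.5, 5.0, size=200)               # stand-in for measured scour depths
predicted = observed + rng.normal(0.0, 0.4, size=200)    # stand-in for model predictions

corrs = bootstrap_corr(observed, predicted)
q25, q50, q75 = np.percentile(corrs, [25, 50, 75])
print(f"mean={corrs.mean():.3f}  std={corrs.std(ddof=1):.3f}  "
      f"median={q50:.3f}  IQR={q75 - q25:.3f}")
```

The mean, standard deviation, median, and interquartile range of the resampled CORR values correspond to the kinds of summary statistics used in the abstract to compare models. A Monte Carlo variant would typically replace the resampling step with repeated random train/test splits, which is likewise an assumption about the procedure rather than a detail given here.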