Ensemble learning analysis of influencing factors on the distribution of urban flood risk points: a case study of Guangzhou, China

Zhao, Juchao; Jin, Wang; Abbas, Zaheer; Yang, Yongchun; Zhao, Yaolong

doi:10.3389/feart.2023.1042088

Cited by 8 publications

(1 citation statement)

References 63 publications

“…This is evident when experts emphasize elevation, for instance, where it becomes a dominant factor in the model, leading to flood susceptibility maps that closely resemble the area's DEM [35]. Similarly, when drainage capacity is emphasized, it dominates, resulting in maps closely aligned with the distribution of drainage capacity [36]. This leads to ambiguity in flood susceptibility assessments for the same area.…”

Section: Introduction (mentioning)

confidence: 99%

Flood Susceptibility Assessment with Random Sampling Strategy in Ensemble Learning (RF and XGBoost)

Ren, Pang, Bai et al. 2024

Remote Sensing

Due to the complex interaction of urban and mountainous floods, assessing flood susceptibility in mountainous urban areas presents a challenging task in environmental research and risk analysis. Data-driven machine learning methods can evaluate flood susceptibility in mountainous urban areas lacking essential hydrological data, utilizing remote sensing data and limited historical inundation records. In this study, two ensemble learning algorithms, Random Forest (RF) and XGBoost, were adopted to assess the flood susceptibility of Kunming, a typical mountainous urban area prone to severe flood disasters. A flood inventory was created using flood observations from 2018 to 2022. The spatial database included 10 explanatory factors, encompassing climatic, geomorphic, and anthropogenic factors. Artificial Neural Network (ANN) and Support Vector Machine (SVM) were selected for model comparison. To minimize the influence of expert opinions on model training, this study employed a strategy of uniformly random sampling in historically non-flooded areas for negative sample selection. The results demonstrated that (1) ensemble learning algorithms offer higher accuracy than other machine learning methods, with RF achieving the highest accuracy, evidenced by an area under the curve (AUC) of 0.87, followed by XGBoost at 0.84, surpassing both ANN (0.83) and SVM (0.82); (2) the interpretability of ensemble learning highlighted the differences in the potential distribution of the training data’s positive and negative samples. Feature importance in ensemble learning can be utilized to minimize human bias in the collection of flooded-site samples, and more targeted flood susceptibility maps of the study area’s road network were obtained; and (3) ensemble learning algorithms exhibited greater stability and robustness in datasets with varied negative samples, as evidenced by their performance in F1-Score, Kappa, and AUC metrics.
This paper further substantiates the superiority of ensemble learning in flood susceptibility assessment tasks from the perspectives of accuracy, interpretability, and robustness, enhances the understanding of the impact of negative samples on such assessments, and optimizes the specific process for urban flood susceptibility assessment using data-driven methods.
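
The negative-sampling strategy and the AUC metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the grid, the flood points, and the helper names `sample_negatives` and `auc` are hypothetical, and the example assumes the study area is represented as a list of cell coordinates.

```python
import random


def sample_negatives(grid_cells, flooded_cells, n_samples, seed=0):
    """Draw negatives uniformly at random from cells with no recorded flooding."""
    flooded = set(flooded_cells)
    candidates = [cell for cell in grid_cells if cell not in flooded]
    if n_samples > len(candidates):
        raise ValueError("not enough non-flooded cells to sample from")
    # Sampling without replacement; every candidate cell is equally likely.
    return random.Random(seed).sample(candidates, n_samples)


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical 10 x 10 grid with a handful of recorded flood points.
grid = [(r, c) for r in range(10) for c in range(10)]
flooded = [(0, 0), (1, 2), (5, 5), (7, 3)]
negatives = sample_negatives(grid, flooded, n_samples=len(flooded), seed=42)
```

Repeating the draw with different seeds yields different negative sets, which is one way to probe the stability across "varied negative samples" that the abstract reports; the classifier producing the scores passed to `auc` is left out of the sketch.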
