The main goal of this study is to produce a landslide susceptibility map for the Wanzhou section of the Three Gorges Reservoir area (China) using a weighted gradient boosting decision tree (weighted GBDT) model. The GBDT method has rarely been used in landslide susceptibility mapping (LSM). Furthermore, previous studies have rarely considered the imbalance of landslide samples and have simply treated LSM as a binary classification problem. In this paper, we treat LSM as an imbalanced learning problem and obtain a better predictive model using the weighted GBDT method. The article makes two main contributions: it introduces the GBDT model into the evaluation of landslide susceptibility, and it uses the weighted GBDT method to deal with landslide sample imbalance. A logistic regression (LR) model and an unweighted GBDT model were used for comparison with the weighted GBDT model. Five kinds of data from different sources were used: geology, topography, hydrology, land cover, and triggering factors (rainfall, earthquake, land use, etc.). Twenty-nine environmental parameters and 233 landslides were used as input data. The receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), and recall were used to evaluate the weighted GBDT model, the GBDT model, and the LR model. The results showed that the GBDT model and the weighted GBDT model had higher AUC values (0.976 and 0.977, respectively) than the LR model (0.845); the weighted GBDT model had a slightly higher AUC value (0.977) than the GBDT model (0.976); and the weighted GBDT model had a higher recall (0.823) than the GBDT model (0.426) and the LR model (0.004). Considering both AUC and recall, the weighted GBDT method performed best of the three models for landslide susceptibility mapping with imbalanced landslide data.
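The gap between AUC and recall reported above (for example, an AUC of 0.845 alongside a recall of 0.004 for LR) is typical of imbalanced data, and it can be illustrated with a minimal sketch. The abstract does not state how the sample weights were chosen, so the inverse-frequency weighting below is an assumption, as are the function names and the toy labels and scores; none of this is the authors' implementation.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this common heuristic stands in for the paper's weighting,
    which the abstract does not specify.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 landslide cells (label 1) among 10 cells.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.45, 0.30, 0.40, 0.10, 0.05, 0.20, 0.15, 0.10, 0.05, 0.02]

weights = balanced_class_weights(y_true)
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # default 0.5 threshold

print(weights)                   # per-class weights
print(roc_auc(y_true, scores))   # ranking quality
print(recall(y_true, y_pred))    # detection rate at the 0.5 threshold
```

In this toy case the scores rank landslide cells well, yet no cell crosses the 0.5 threshold, so recall is zero. Per-class weights like these could be passed as per-sample weights to a boosting library during training, which is one plausible reading of "weighted GBDT".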
A landslide is a type of geological disaster that poses a threat to human lives and property. Landslide susceptibility assessment (LSA) is a crucial tool for landslide prevention. This paper’s primary objective is to compare the performance of conventional shallow machine learning methods and deep learning methods in LSA based on imbalanced data, in order to evaluate the applicability of the two types of LSA models when class-weighted strategies are applied. Logistic regression (LR), random forest (RF), deep fully connected neural network (DFCNN), and long short-term memory (LSTM) neural network models were built for the Zigui-Badong area of the Three Gorges Reservoir area, China. Eighteen landslide influence factors were used to compare the performance of the four models under a class-balanced strategy versus a class-imbalanced strategy. The Spearman rank correlation coefficient (SRCC) was applied for factor correlation analysis. The results reveal that elevation and distance to rivers play a dominant role in LSA tasks. Under the class-imbalanced strategy, DFCNN (AUC = 0.87, F1-score = 0.60) and LSTM (AUC = 0.89, F1-score = 0.61) significantly outperformed LR (AUC = 0.89, F1-score = 0.50) and RF (AUC = 0.88, F1-score = 0.50) in terms of F1-score. Under the class-balanced strategy, the RF model achieved outcomes comparable to those of the deep learning models (AUC = 0.90, F1-score = 0.61) and trained up to 63 times faster. The LR model performed worse than the other three models under the class-balanced strategy. The deep learning models and the shallow machine learning models also showed significant differences in the spatial patterns of susceptibility. This paper’s findings will aid researchers in selecting appropriate LSA models. They are also valuable for land management policy making and for disaster prevention and mitigation.
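The factor correlation analysis mentioned above relies on the Spearman rank correlation coefficient, which is the Pearson correlation computed on ranks. The following is a minimal sketch of that calculation; the function names and the five-cell factor values are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical factor values for five map cells.
elevation = [120, 340, 560, 800, 950]          # metres
slope = [5, 12, 20, 31, 45]                    # degrees
distance_to_river = [900, 700, 400, 450, 100]  # metres

print(spearman(elevation, slope))
print(spearman(elevation, distance_to_river))
```

Factor pairs whose coefficient is close to 1 or -1 carry largely redundant information, which is the usual reason for running such a check before modeling.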