The objective of this research is to introduce a new machine learning ensemble approach, named BE-LMTree, that hybridizes the Bagging ensemble (BE) and Logistic Model Trees (LMTree) to improve the performance of landslide susceptibility models. LMTree is a relatively new machine learning algorithm that has rarely been explored in landslide studies, whereas BE is an ensemble framework that has proven highly efficient for landslide modeling. The Upper Reaches Area of the Red River Basin (URRB) in the Northwest region of Viet Nam was used as a case study. For this work, a GIS database for the URRB area was established, containing a total of 255 landslide polygons and eight predisposing factors, i.e., slope, aspect, elevation, land cover, soil type, lithology, distance to fault, and distance to river. The database was then used to construct and validate the proposed BE-LMTree model. The quality of the final BE-LMTree model was assessed using a confusion matrix and a set of statistical measures. The results showed that the proposed BE-LMTree model performs well, with a classification accuracy of 93.81% on the training dataset and a prediction capability of 83.4% on the validation dataset. The proposed BE-LMTree model performed better than both the support vector machine model and the LMTree model; therefore, we conclude that BE-LMTree could be a new, efficient tool for landslide modeling. This research could provide useful results for landslide modeling in landslide-prone areas.
The hybridization of BE and LMTree has resulted in a new, powerful prediction method and, to the best of our knowledge, this is the first time that BE-LMTree has been studied for landslide susceptibility.
Theoretical Background of the Methods
Logistic Model Tree
Logistic Model Trees (LMTree), a relatively new machine learning algorithm, was developed by integrating a tree induction algorithm with additive logistic regression [52]. LMTree differs from other decision tree algorithms in that the tree-growing process is carried out using the LogitBoost algorithm [52,55] and the tree pruning is performed using the Classification And Regression Tree (CART) algorithm [56].
Consider a training dataset $T = \{(x_i, y_i)\}_{i=1}^{ds}$, where $x_i \in \mathbb{R}^D$ is the input vector, $ds$ is the number of data samples, $D$ is the dimension of the input vector, and $y_i \in \{1, 0\}$ is the class label. In the context of this research, the input vector consists of eight variables (slope, aspect, elevation, land cover, soil type, lithology, distance to fault, and distance to river), whereas the class label takes one of two values, landslide (LS) or non-landslide (Non-LS). The landslide class is coded as "1" and the non-landslide class is coded as "0". The objective of LMTree is to construct a tree-structured model that is capable of classifying the training dataset into these two classes in terms of probability. The numeric value predicted for the landslide class of a sample is used as the susceptibility index.
Structurally, ...