Potential of Ensemble Learning to Improve Tree-Based Classifiers for Landslide Susceptibility Mapping

Bibliographic Details
Main Authors: Jiahui Song, Yi Wang, Zhice Fang, Ling Peng, Haoyuan Hong
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/9157927/
Description
Summary: Ensemble learning methods have been widely used because of their strong generalization performance, but their potential for landslide spatial prediction has not been fully studied. To take full advantage of ensemble learning techniques, the classification and regression tree classifier and four tree-based ensemble classifiers (random forest, extremely randomized tree, gradient boosting decision trees, and extreme gradient boosting decision trees) are used in this study for landslide susceptibility assessment. Specifically, a stacking ensemble learning framework coupled with embedded feature selection is presented, consisting of the tree-based classifiers mentioned above as base learners and logistic regression as a metalearner in a two-layer structure. In the study area of Yongxin, China, 364 historical landslide locations were first randomly partitioned at a ratio of 7/3 into training and testing sets. Then, a spatial database of 16 landslide causative factors was constructed for landslide prediction. Meanwhile, the relative importance of these factors was quantified by using the total number of feature splits and the average Gini index during the training process, and a novel embedded feature selection method was used in the base learners of the proposed framework to further improve computational efficiency and predictive performance by allowing each base learner to obtain its own optimal subfeature space. Finally, the different methods were assessed by using several evaluation criteria. Experimental results demonstrated that the proposed ensemble learning framework achieved the highest area under the curve (AUC) value, 0.864, and was more effective than the conventional tree-based classifiers and other ensemble learning methods.
ISSN: 2151-1535
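
The evaluation protocol mentioned in the summary (a random 7/3 partition of the 364 landslide locations, followed by scoring with the area under the ROC curve) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the helper names `split_7_3` and `auc`, the seed, the rounding rule for the split, and the toy labels and scores are hypothetical, and the record does not say how the authors implemented either step.

```python
import random


def split_7_3(items, seed=0):
    """Randomly partition items into roughly 70% training and 30% testing subsets.

    The seed and the use of round() are assumptions for reproducibility only.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.7 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels are 0/1 integers; tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j to the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical stand-in for the 364 landslide locations: integer IDs 0..363.
train_ids, test_ids = split_7_3(range(364))
print(len(train_ids), len(test_ids))  # 255 109

# Toy labels and scores, not data from the study.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In practice the scores passed to `auc` would be the susceptibility probabilities predicted for the held-out test locations, so a value such as the reported 0.864 reflects how well a model ranks landslide locations above non-landslide ones.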