An Ensemble Approach of Feature Selection and Machine Learning Models for Regional Landslide Susceptibility Mapping in the Arid Mountainous Terrain of Southern Peru

Bibliographic Details
Main Authors: Chandan Kumar, Gabriel Walton, Paul Santi, Carlos Luza
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Remote Sensing
Subjects:
Online Access: https://www.mdpi.com/2072-4292/15/5/1376
Description
Summary: This study evaluates an ensemble framework of feature selection and machine learning (ML) models for regional landslide susceptibility mapping (LSM) under the arid climatic conditions of southern Peru. A historical landslide inventory and 24 landslide influencing factors (LIFs) were prepared using remotely sensed and auxiliary datasets. The LIFs were evaluated using multicollinearity statistics, and their relative importance was measured with an ensemble feature selection method, built from the chi-square, gain ratio, and Relief-F methods, to select the most discriminative LIFs. We evaluated the performance of ten ML algorithms (linear discriminant analysis, mixture discriminant analysis, bagged CART, boosted logistic regression, k-nearest neighbors, artificial neural network, support vector machine, random forest, rotation forest, and C5.0) using four accuracy statistics (sensitivity, specificity, area under the curve (AUC), and overall accuracy (OA)). We combined suitable individual ML models into ensemble ML models and evaluated their performance in LSM. We also assessed the impact of the LIFs on ML performance. Among the individual ML models, the k-nearest neighbors (sensitivity = 0.72, specificity = 0.82, AUC = 0.86, OA = 78%) and artificial neural network (sensitivity = 0.71, specificity = 0.85, AUC = 0.87, OA = 79%) algorithms performed best using the top five LIFs, while random forest, rotation forest, and C5.0 (sensitivity = 0.76–0.81, specificity = 0.87, AUC = 0.90–0.93, OA = 82–84%) outperformed the other models when developed using all 24 LIFs. Among the ensemble models, the ensembles of k-nearest neighbors and rotation forest, k-nearest neighbors and artificial neural network, and artificial neural network and rotation forest outperformed the others (sensitivity = 0.72–0.73, specificity = 0.83–0.84, AUC = 0.86, OA = 79%) using the top five LIFs. The landslide susceptibility maps derived using these models indicate that ~2–3% and ~10–12% of the total study area fall within the “very high” and “high” susceptibility classes, respectively. The susceptibility maps can be used to prioritize landslide mitigation activities.
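For readers unfamiliar with the four accuracy statistics reported in the summary, the following is a minimal Python sketch of their standard definitions. It is not the authors' code: the function name `accuracy_statistics`, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration (1 = landslide, 0 = non-landslide).

```python
from typing import Dict, Sequence


def accuracy_statistics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the article
) -> Dict[str, float]:
    """Sensitivity, specificity, overall accuracy, and AUC for binary labels."""
    # Turn susceptibility scores into hard landslide / non-landslide labels.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # landslides found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # landslides missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stable cells found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms

    # AUC as the probability that a randomly chosen landslide sample scores
    # higher than a randomly chosen non-landslide sample (ties count half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "overall_accuracy": (tp + tn) / len(y_true),
        "auc": wins / (len(pos) * len(neg)),
    }


# Hypothetical toy data: three landslide and four non-landslide samples.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
stats = accuracy_statistics(labels, scores)
```

Sensitivity, specificity, and OA depend on the chosen threshold, whereas AUC is computed from the scores directly, which is one reason the summary reports all four.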
ISSN: 2072-4292