An Ensemble Approach of Feature Selection and Machine Learning Models for Regional Landslide Susceptibility Mapping in the Arid Mountainous Terrain of Southern Peru

This study evaluates an ensemble framework of feature selection and machine learning (ML) models for regional landslide susceptibility mapping (LSM) under the arid climatic conditions of southern Peru. A historical landslide inventory and 24 landslide influencing factors (LIFs)...

Bibliographic Details
Main Authors: Chandan Kumar, Gabriel Walton, Paul Santi, Carlos Luza
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Remote Sensing
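Subjects: ensemble feature selection; ensemble machine learning; landslide susceptibility mapping; geohazards; Peru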
Online Access: https://www.mdpi.com/2072-4292/15/5/1376
collection DOAJ
description This study evaluates an ensemble framework of feature selection and machine learning (ML) models for regional landslide susceptibility mapping (LSM) under the arid climatic conditions of southern Peru. A historical landslide inventory and 24 landslide influencing factors (LIFs) were prepared using remotely sensed and auxiliary datasets. The LIFs were screened using multicollinearity statistics, and their relative importance was measured with an ensemble feature selection method, built from the chi-square, gain ratio, and Relief-F methods, to select the most discriminative LIFs. We evaluated the performance of ten ML algorithms (linear discriminant analysis, mixture discriminant analysis, bagged CART, boosted logistic regression, k-nearest neighbors, artificial neural network, support vector machine, random forest, rotation forest, and C5.0) using four accuracy statistics: sensitivity, specificity, area under the curve (AUC), and overall accuracy (OA). We used suitable combinations of individual ML models to develop ensemble ML models and evaluated their performance in LSM. We also assessed the impact of the LIFs on ML performance. Among the individual ML models, the k-nearest neighbors (sensitivity = 0.72, specificity = 0.82, AUC = 0.86, OA = 78%) and artificial neural network (sensitivity = 0.71, specificity = 0.85, AUC = 0.87, OA = 79%) algorithms performed best using the top five LIFs, while random forest, rotation forest, and C5.0 (sensitivity = 0.76–0.81, specificity = 0.87, AUC = 0.90–0.93, OA = 82–84%) outperformed the other models when developed using all twenty-four LIFs. Among the ensemble models, the ensembles of k-nearest neighbors and rotation forest, k-nearest neighbors and artificial neural network, and artificial neural network and rotation forest outperformed the others (sensitivity = 0.72–0.73, specificity = 0.83–0.84, AUC = 0.86, OA = 79%) using the top five LIFs. The landslide susceptibility maps derived using these models indicate that ~2–3% and ~10–12% of the total study area fall within the “very high” and “high” susceptibility classes, respectively. The resulting susceptibility maps can be used to prioritize landslide mitigation activities.
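The accuracy statistics named in the description (sensitivity, specificity, and overall accuracy) follow from a binary confusion matrix. The sketch below is illustrative only and is not the authors' code: the ConfusionCounts class, the accuracy_statistics function, and the counts in the example are hypothetical and are not taken from the article. AUC is omitted because it requires ranked model scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary landslide / non-landslide counts."""
    tp: int  # landslide samples predicted as landslide
    fn: int  # landslide samples predicted as non-landslide
    tn: int  # non-landslide samples predicted as non-landslide
    fp: int  # non-landslide samples predicted as landslide


def accuracy_statistics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and overall accuracy as fractions."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        # Share of true landslide samples that the model detects.
        "sensitivity": c.tp / (c.tp + c.fn),
        # Share of true non-landslide samples that the model rejects.
        "specificity": c.tn / (c.tn + c.fp),
        # Share of all samples classified correctly.
        "overall_accuracy": (c.tp + c.tn) / total,
    }


# Made-up counts for illustration; they do not come from the study.
example = accuracy_statistics(ConfusionCounts(tp=40, fn=10, tn=45, fp=5))
```

With these made-up counts, 40 of 50 landslide samples and 45 of 50 non-landslide samples are classified correctly, so overall accuracy is 85 of 100.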
first_indexed 2024-03-11T07:11:05Z
id doaj.art-094fdccf73154fffac419938ac9426d0
institution Directory Open Access Journal
issn 2072-4292
last_indexed 2024-03-11T07:11:05Z
spelling doaj.art-094fdccf73154fffac419938ac9426d0 2023-11-17T08:32:13Z eng
MDPI AG, Remote Sensing, ISSN 2072-4292, 2023-02-01, 15(5), 1376
DOI: 10.3390/rs15051376
Chandan Kumar, Department of Geology and Geological Engineering, Colorado School of Mines, Golden, CO 80401, USA
Gabriel Walton, Department of Geology and Geological Engineering, Colorado School of Mines, Golden, CO 80401, USA
Paul Santi, Department of Geology and Geological Engineering, Colorado School of Mines, Golden, CO 80401, USA
Carlos Luza, Department of Geology, Geophysics and Mines, Universidad Nacional de San Agustín, Arequipa 04000, Peru
topic ensemble feature selection
ensemble machine learning
landslide susceptibility mapping
geohazards
Peru