An Ensemble Approach of Feature Selection and Machine Learning Models for Regional Landslide Susceptibility Mapping in the Arid Mountainous Terrain of Southern Peru
This study evaluates the utility of the ensemble framework of feature selection and machine learning (ML) models for regional landslide susceptibility mapping (LSM) in the arid climatic condition of southern Peru. A historical landslide inventory and 24 different landslide influencing factors (LIFs)...
Main Authors: | Chandan Kumar, Gabriel Walton, Paul Santi, Carlos Luza |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-02-01 |
Series: | Remote Sensing |
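Subjects: | ensemble feature selection, ensemble machine learning, landslide susceptibility mapping, geohazards, Peru |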
Online Access: | https://www.mdpi.com/2072-4292/15/5/1376 |
_version_ | 1797614415565029376 |
---|---|
author | Chandan Kumar; Gabriel Walton; Paul Santi; Carlos Luza |
author_sort | Chandan Kumar |
collection | DOAJ |
description | This study evaluates an ensemble framework of feature selection and machine learning (ML) models for regional landslide susceptibility mapping (LSM) under the arid climatic conditions of southern Peru. A historical landslide inventory and 24 landslide influencing factors (LIFs) were prepared using remotely sensed and auxiliary datasets. The LIFs were evaluated using multi-collinearity statistics, and their relative importance was measured with an ensemble feature selection method, built from the Chi-square, gain ratio, and relief-F methods, to select the most discriminative LIFs. We evaluated the performance of ten ML algorithms (linear discriminant analysis, mixture discriminant analysis, bagged CART, boosted logistic regression, k-nearest neighbors, artificial neural network, support vector machine, random forest, rotation forest, and C5.0) using four accuracy statistics (sensitivity, specificity, area under the curve (AUC), and overall accuracy (OA)). We combined suitable individual ML models into ensemble ML models and evaluated their performance in LSM. We also assessed the impact of the LIFs on ML performance. Among the individual ML models, k-nearest neighbors (sensitivity = 0.72, specificity = 0.82, AUC = 0.86, OA = 78%) and artificial neural network (sensitivity = 0.71, specificity = 0.85, AUC = 0.87, OA = 79%) performed best using the top five LIFs, while random forest, rotation forest, and C5.0 (sensitivity = 0.76–0.81, specificity = 0.87, AUC = 0.90–0.93, OA = 82–84%) outperformed the other models when developed using all 24 LIFs. Among the ensemble models, the combinations of k-nearest neighbors and rotation forest, k-nearest neighbors and artificial neural network, and artificial neural network and rotation forest outperformed the others (sensitivity = 0.72–0.73, specificity = 0.83–0.84, AUC = 0.86, OA = 79%) using the top five LIFs. The landslide susceptibility maps derived from these models indicate that ~2–3% and ~10–12% of the total study area fall within the “very high” and “high” susceptibility classes, respectively. The resulting susceptibility maps can be used to prioritize landslide mitigation activities. |
first_indexed | 2024-03-11T07:11:05Z |
format | Article |
id | doaj.art-094fdccf73154fffac419938ac9426d0 |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-11T07:11:05Z |
publishDate | 2023-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-094fdccf73154fffac419938ac9426d0; 2023-11-17T08:32:13Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2023-02-01; vol. 15, no. 5, article 1376; DOI 10.3390/rs15051376; An Ensemble Approach of Feature Selection and Machine Learning Models for Regional Landslide Susceptibility Mapping in the Arid Mountainous Terrain of Southern Peru; Chandan Kumar (0), Gabriel Walton (1), Paul Santi (2), Carlos Luza (3); (0, 1, 2) Department of Geology and Geological Engineering, Colorado School of Mines, Golden, CO 80401, USA; (3) Department of Geology, Geophysics and Mines, Universidad Nacional de San Agustín, Arequipa 04000, Peru; https://www.mdpi.com/2072-4292/15/5/1376; ensemble feature selection; ensemble machine learning; landslide susceptibility mapping; geohazards; Peru |
title | An Ensemble Approach of Feature Selection and Machine Learning Models for Regional Landslide Susceptibility Mapping in the Arid Mountainous Terrain of Southern Peru |
topic | ensemble feature selection; ensemble machine learning; landslide susceptibility mapping; geohazards; Peru |
url | https://www.mdpi.com/2072-4292/15/5/1376 |
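
The accuracy statistics named in the description (sensitivity, specificity, and overall accuracy) and the idea of combining two individual models can be illustrated with a minimal Python sketch. The record does not say how the authors combined models, so averaging the predicted probabilities (soft voting) is an assumption made here for illustration only. The function names, the 0.5 threshold, and all labels and probabilities are hypothetical and are not taken from the study. AUC is left out to keep the sketch short.

```python
from typing import Sequence


def average_probabilities(*model_probs: Sequence[float]) -> list[float]:
    """Soft-voting ensemble (an assumption, not the authors' stated method):
    the mean of each model's predicted landslide probability per sample."""
    n_models = len(model_probs)
    return [sum(sample) / n_models for sample in zip(*model_probs)]


def accuracy_statistics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> dict[str, float]:
    """Sensitivity, specificity and overall accuracy at a fixed threshold.

    Assumes y_true contains at least one landslide (1) and one
    non-landslide (0) sample, otherwise a division by zero occurs.
    """
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if truth == 1 and predicted == 1:
            tp += 1  # landslide correctly flagged
        elif truth == 1:
            fn += 1  # landslide missed
        elif predicted == 1:
            fp += 1  # stable cell wrongly flagged
        else:
            tn += 1  # stable cell correctly left unflagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "overall_accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Made-up labels (1 = landslide, 0 = non-landslide) and made-up
# probabilities standing in for two individual models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
knn_prob = [0.9, 0.6, 0.4, 0.2, 0.1, 0.3, 0.6, 0.2]
ann_prob = [0.8, 0.7, 0.7, 0.3, 0.2, 0.1, 0.5, 0.4]

ensemble_prob = average_probabilities(knn_prob, ann_prob)
stats = accuracy_statistics(y_true, ensemble_prob)
print(stats)
```

With these toy inputs, three of the four landslide samples and three of the four non-landslide samples are classified correctly, so all three statistics come out at 0.75. Sensitivity and specificity are reported separately because a model can score well on overall accuracy while still missing many landslide cells.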