A Comparative Assessment of Machine Learning Models for Landslide Susceptibility Mapping in the Rugged Terrain of Northern Pakistan
This study investigated the performances of different techniques, including random forest (RF), support vector machine (SVM), maximum entropy (maxENT), gradient-boosting machine (GBM), and logistic regression (LR), for landslide susceptibility mapping (LSM) in the rugged terrain of northern Pakistan...
Main Authors: | Naeem Shahzad, Xiaoli Ding, Sawaid Abbas |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2022-02-01 |
Series: | Applied Sciences |
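Subjects: | machine learning, landslide susceptibility, random forest, support vector machine, gradient-boosting machine, maximum entropy |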
Online Access: | https://www.mdpi.com/2076-3417/12/5/2280 |
_version_ | 1827651988035207168 |
---|---|
author | Naeem Shahzad; Xiaoli Ding; Sawaid Abbas |
author_sort | Naeem Shahzad |
collection | DOAJ |
description | This study investigated the performance of different techniques, including random forest (RF), support vector machine (SVM), maximum entropy (maxENT), gradient-boosting machine (GBM), and logistic regression (LR), for landslide susceptibility mapping (LSM) in the rugged terrain of northern Pakistan. Initially, a landslide inventory of 200 samples was produced, along with an additional 200 samples indicating nonlandslide areas, and divided into training (70%) and validation (30%) groups using a stratified loop-based random sampling approach. Then, a geospatial database of 12 possible landslide influencing factors (LIFs) was generated, including elevation, slope, aspect, topographic wetness index (TWI), topographic position index (TPI), distance to drainage, distance to fault, distance to road, normalized difference vegetation index (NDVI), rainfall, land cover/land use (LCLU), and a geological map of the study area. None of the LIFs were redundant for the modeling, as indicated by the multicollinearity test (tolerance > 0.1) and information gain ratio (IGR > 0). We extended the evaluation measures of each algorithm from area-under-the-curve (AUC) analysis to the calculation of performance overall (POA) with the help of precision, recall, F1 score, accuracy (ACC), and the Matthews correlation coefficient (MCC). The results showed that the SVM was the most promising model (AUC = 0.969, POA = 2669) for the LSM, followed by RF (AUC = 0.967, POA = 2656), GBM (AUC = 0.967, POA = 2623), maxENT (AUC = 0.872, POA = 1761), and LR (AUC = 0.836, POA = 1299). Notably, the SVM, RF, and GBM were the top performers, with nearly identical accuracy. Thus, each of these could be equally effective for LSM and can be used for risk reduction and mitigation measures in the rugged terrain of Pakistan and other regions with similar topography. |
first_indexed | 2024-03-09T20:49:11Z |
format | Article |
id | doaj.art-067143dae67544baac935c173d0c6dfe |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T20:49:11Z |
publishDate | 2022-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-067143dae67544baac935c173d0c6dfe; 2023-11-23T22:38:19Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-02-01; vol. 12, no. 5, art. 2280; 10.3390/app12052280; A Comparative Assessment of Machine Learning Models for Landslide Susceptibility Mapping in the Rugged Terrain of Northern Pakistan; Naeem Shahzad, Xiaoli Ding, Sawaid Abbas (all: Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China); https://www.mdpi.com/2076-3417/12/5/2280; machine learning; landslide susceptibility; random forest; support vector machine; gradient-boosting machine; maximum entropy |
title | A Comparative Assessment of Machine Learning Models for Landslide Susceptibility Mapping in the Rugged Terrain of Northern Pakistan |
topic | machine learning; landslide susceptibility; random forest; support vector machine; gradient-boosting machine; maximum entropy |
url | https://www.mdpi.com/2076-3417/12/5/2280 |
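The description field reports a 70%/30% training/validation division of 200 landslide and 200 nonlandslide samples using stratified random sampling. A minimal sketch of one such split follows. The function name `stratified_split`, the integer sample IDs, and the fixed seed are assumptions made for illustration. The record does not say what the "loop-based" part of the authors' procedure involves, so this shows only a plain per-class shuffle and cut.

```python
import random


def stratified_split(samples, labels, train_fraction=0.7, seed=0):
    """Split samples into training and validation sets, class by class.

    Hypothetical helper: each class is shuffled and cut separately so that
    both sets keep the original class proportions.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    train, valid = [], []
    for label, members in by_class.items():
        members = members[:]          # copy so the caller's data is untouched
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train += [(m, label) for m in members[:cut]]
        valid += [(m, label) for m in members[cut:]]
    return train, valid


# 200 landslide (label 1) and 200 nonlandslide (label 0) sample IDs,
# matching the counts given in the description; the IDs are placeholders.
samples = list(range(400))
labels = [1] * 200 + [0] * 200
train, valid = stratified_split(samples, labels)
print(len(train), len(valid))
```

With these counts, each class contributes 140 samples to training and 60 to validation, so both sets stay balanced.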
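The description field names precision, recall, F1 score, accuracy (ACC), and the Matthews correlation coefficient (MCC) as the measures behind the performance overall (POA) score. The sketch below computes those five measures from a binary confusion matrix using their standard definitions. The record does not state how POA aggregates them, so that step is left out. The function name `classification_metrics` and the example counts are assumptions for illustration and are not results from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification measures from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}


# Hypothetical counts for a 120-sample validation set (60 per class).
metrics = classification_metrics(tp=54, fp=6, tn=54, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, which is one reason it is often reported alongside AUC when comparing classifiers.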