A Comparative Assessment of Machine Learning Models for Landslide Susceptibility Mapping in the Rugged Terrain of Northern Pakistan

This study investigated the performances of different techniques, including random forest (RF), support vector machine (SVM), maximum entropy (maxENT), gradient-boosting machine (GBM), and logistic regression (LR), for landslide susceptibility mapping (LSM) in the rugged terrain of northern Pakistan...

Bibliographic Details
Main Authors: Naeem Shahzad, Xiaoli Ding, Sawaid Abbas
Format: Article
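Language: English
Published: MDPI AG 2022-02-01
Series: Applied Sciences
Subjects: machine learning; landslide susceptibility; random forest; support vector machine; gradient-boosting machine; maximum entropy
Online Access: https://www.mdpi.com/2076-3417/12/5/2280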
_version_ 1827651988035207168
author Naeem Shahzad
Xiaoli Ding
Sawaid Abbas
collection DOAJ
description This study compared the performance of five techniques, namely random forest (RF), support vector machine (SVM), maximum entropy (maxENT), gradient-boosting machine (GBM), and logistic regression (LR), for landslide susceptibility mapping (LSM) in the rugged terrain of northern Pakistan. First, a landslide inventory of 200 samples was compiled, together with 200 samples from nonlandslide areas, and the combined set was divided into training (70%) and validation (30%) groups using a stratified loop-based random sampling approach. Then, a geospatial database of 12 possible landslide influencing factors (LIFs) was generated, including elevation, slope, aspect, topographic wetness index (TWI), topographic position index (TPI), distance to drainage, distance to fault, distance to road, normalized difference vegetation index (NDVI), rainfall, land cover/land use (LCLU), and a geological map of the study area. None of the LIFs were redundant for the modeling, as indicated by the multicollinearity test (tolerance > 0.1) and information gain ratio (IGR > 0). We extended the evaluation of each algorithm from area-under-the-curve (AUC) analysis to the calculation of performance overall (POA) using precision, recall, F1 score, accuracy (ACC), and the Matthews correlation coefficient (MCC). The results showed that the SVM was the most promising model (AUC = 0.969, POA = 2669) for the LSM, followed by RF (AUC = 0.967, POA = 2656), GBM (AUC = 0.967, POA = 2623), maxENT (AUC = 0.872, POA = 1761), and LR (AUC = 0.836, POA = 1299). Notably, the SVM, RF, and GBM were the top performers, with nearly identical accuracy. Thus, each of these could be equally effective for LSM and can be used for risk reduction and mitigation measures in the rugged terrain of Pakistan and other regions with similar topography.
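To make the sample split and the evaluation measures named in the description concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' code: the "loop-based" sampling procedure is not specified in the abstract, so a plain per-class shuffle stands in for it; the function names and the confusion-matrix counts are hypothetical; and POA is left out because its formula is not given here.

```python
import math
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into training and validation sets, class by class.

    Assumption: a simple per-class shuffle; the record does not describe
    the authors' loop-based procedure in enough detail to reproduce it.
    """
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        valid.extend(idx[cut:])
    return sorted(train), sorted(valid)


def binary_metrics(tp, fp, tn, fn):
    """Precision, recall, F1, accuracy, and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    acc = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "acc": acc, "mcc": mcc}


# 200 landslide (1) and 200 nonlandslide (0) samples, as in the description.
labels = [1] * 200 + [0] * 200
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 280 120

# Hypothetical validation counts, NOT results from the paper.
print(binary_metrics(tp=55, fp=5, tn=55, fn=5))
```

With 200 samples per class, a 70/30 stratified split gives 140 training and 60 validation samples in each class, so both subsets stay balanced. MCC is worth reporting next to accuracy because it uses all four confusion-matrix cells and falls to zero for a classifier that does no better than chance.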
first_indexed 2024-03-09T20:49:11Z
format Article
id doaj.art-067143dae67544baac935c173d0c6dfe
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T20:49:11Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-067143dae67544baac935c173d0c6dfe
2023-11-23T22:38:19Z
eng
MDPI AG
Applied Sciences
2076-3417
2022-02-01
vol. 12, no. 5, art. 2280
10.3390/app12052280
A Comparative Assessment of Machine Learning Models for Landslide Susceptibility Mapping in the Rugged Terrain of Northern Pakistan
Naeem Shahzad, Xiaoli Ding, Sawaid Abbas (all: Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China)
https://www.mdpi.com/2076-3417/12/5/2280
machine learning; landslide susceptibility; random forest; support vector machine; gradient-boosting machine; maximum entropy
title A Comparative Assessment of Machine Learning Models for Landslide Susceptibility Mapping in the Rugged Terrain of Northern Pakistan
topic machine learning
landslide susceptibility
random forest
support vector machine
gradient-boosting machine
maximum entropy
url https://www.mdpi.com/2076-3417/12/5/2280