Landslide susceptibility analysis using random forest model with SMOTE-ENN resampling algorithm

Landslides are natural disasters that cause property damage and human injury. Landslide hazard prediction is a crucial measure for reducing damage and losses. One effective approach to landslide prediction is landslide susceptibility analysis (LSA). In this article, LSA is ca...

Bibliographic Details
Main Authors: Mingxi Lu, Lea Tien Tay, Junita Mohamad-Saleh
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: Geomatics, Natural Hazards & Risk
Subjects: Landslide; SMOTE-ENN; random forest; susceptibility
Online Access: https://www.tandfonline.com/doi/10.1080/19475705.2024.2314565
author Mingxi Lu
Lea Tien Tay
Junita Mohamad-Saleh
collection DOAJ
description Landslides are natural disasters that cause property damage and human injury. Landslide hazard prediction is a crucial measure for reducing damage and losses, and landslide susceptibility analysis (LSA) is one effective approach to it. In this article, LSA is carried out for the study area, Penang Island. The main issue addressed is the imbalanced landslide dataset: four resampling methods were compared on the training set, using random forest (RF) as the base model. To make the results more credible, the experiments were replicated 10 times, and McNemar's test was applied to assess the statistical significance of differences in classifier performance for the LSA. The results indicated that the differences between the methods were statistically significant; RF combined with the synthetic minority oversampling technique-edited nearest neighbour (SMOTE-ENN) resampling method proposed in this paper had a positive effect on LSA compared with the other resampling methods. With min–max normalization, the combined RF and SMOTE-ENN model achieved a recall of 0.844 and an F2-score of 0.756. The SMOTE-ENN method had a significant impact on the LSA of the imbalanced data in the study area.
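The description names three quantities without defining them: min–max normalization, the F2-score, and McNemar's test. The sketch below is a minimal, standard-library illustration of the textbook definitions only. It is not the authors' code; the function names are hypothetical, and the continuity-corrected form of McNemar's test is an assumption, since the record does not say which variant was used. The SMOTE-ENN resampling and random forest steps are not reproduced here.

```python
import math


def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def recall(tp, fn):
    """Fraction of actual positives (e.g. landslide cells) that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def fbeta_score(tp, fp, fn, beta=2.0):
    """F-beta from confusion counts; beta=2 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def mcnemar(b, c):
    """Continuity-corrected McNemar test (assumed variant) on discordant counts.

    b: samples classifier A got right and classifier B got wrong.
    c: samples classifier B got right and classifier A got wrong.
    Returns (chi-square statistic, p-value for 1 degree of freedom).
    """
    if b + c == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 d.o.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

For example, with made-up counts of 8 true positives, 4 false positives and 2 false negatives, `recall(8, 2)` and `fbeta_score(8, 4, 2)` show how F2 stays close to recall even when precision is lower, which is why F2 suits a setting where missing a landslide costs more than a false alarm.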
first_indexed 2024-03-07T23:19:24Z
format Article
id doaj.art-df444965d0eb4900ae9034037b360b3f
institution Directory Open Access Journal
issn 1947-5705
1947-5713
language English
last_indexed 2025-02-17T18:25:43Z
publishDate 2024-12-01
publisher Taylor & Francis Group
record_format Article
series Geomatics, Natural Hazards & Risk
spelling doaj.art-df444965d0eb4900ae9034037b360b3f; 2024-12-12T18:11:18Z; eng; Taylor & Francis Group; Geomatics, Natural Hazards & Risk; 1947-5705; 1947-5713; 2024-12-01; 15; 1; 10.1080/19475705.2024.2314565; Landslide susceptibility analysis using random forest model with SMOTE-ENN resampling algorithm; Mingxi Lu, Lea Tien Tay, Junita Mohamad-Saleh (all: School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering Campus, Nibong Tebal, Malaysia); https://www.tandfonline.com/doi/10.1080/19475705.2024.2314565; Landslide; SMOTE-ENN; random forest; susceptibility
title Landslide susceptibility analysis using random forest model with SMOTE-ENN resampling algorithm
topic Landslide
SMOTE-ENN
random forest
susceptibility
url https://www.tandfonline.com/doi/10.1080/19475705.2024.2314565