Landslide susceptibility assessment of South Korea using stacking ensemble machine learning

Bibliographic Details
Main Authors: Seung-Min Lee, Seung-Jae Lee
Format: Article
Language: English
Published: SpringerOpen 2024-02-01
Series: Geoenvironmental Disasters
Subjects: Landslide, Susceptibility model, Stacking ensemble, Machine learning
Online Access: https://doi.org/10.1186/s40677-024-00271-y
author Seung-Min Lee
Seung-Jae Lee
collection DOAJ
description Abstract
Background: Landslide susceptibility assessment (LSA) is a crucial indicator of landslide hazards, and its accuracy is improving with the development of artificial intelligence (AI) technology. However, the performance of AI algorithms is inconsistent across regions and depends strongly on the input variables. Additionally, LSA must include historical data, which often restricts the assessment to the local scale and to single landslide events.
Methods: In this study, we performed an LSA for the entirety of South Korea. A total of 30 input variables were constructed, consisting of 9 variables from the past climate model data MK-PRISM, 12 topographical factors, and 9 environmental factors. Sixteen machine learning algorithms were used as base classifiers, and a stacking ensemble was built from the four algorithms with the highest area under the curve (AUC). Additionally, a separate assessment model was established for areas at risk of landslides affecting more than 1 ha.
Results: The highest-performing classifier was CatBoost, with an AUC of ~0.89 for both assessments. Among the input variables, distance to roads, daily maximum precipitation, digital elevation model, and soil depth were the most influential. For all landslide events, the classifiers with the highest AUC were, in descending order, CatBoost, LightGBM, XGBoost, and Random Forest; for large landslide events, the order was CatBoost, XGBoost, Extra Tree, and LightGBM. The stacking ensemble enabled the construction of two landslide susceptibility maps.
Conclusions: Our findings provide a statistical method for constructing a high-resolution (30 m) landslide susceptibility map on a country scale using diverse natural factors, including past climate data.
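The selection-and-stacking step summarized in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the base-classifier names (`model_a` to `model_f`), the synthetic scores, and the logistic-regression meta-learner are all assumptions made for illustration, and the meta-learner is fitted on the same data it is scored on for brevity, whereas a real stacking workflow would normally use out-of-fold predictions.

```python
import math
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k_by_auc(labels, base_scores, k=4):
    """Names of the k base classifiers with the highest AUC, best first."""
    ranked = sorted(base_scores,
                    key=lambda name: auc(labels, base_scores[name]),
                    reverse=True)
    return ranked[:k]


def fit_meta_learner(labels, columns, epochs=500, lr=0.5):
    """Logistic-regression meta-learner fitted by batch gradient descent."""
    n, m = len(labels), len(columns)
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * m, 0.0
        for i in range(n):
            z = b + sum(w[j] * columns[j][i] for j in range(m))
            err = 1.0 / (1.0 + math.exp(-z)) - labels[i]
            gb += err
            for j in range(m):
                gw[j] += err * columns[j][i]
        b -= lr * gb / n
        w = [w[j] - lr * gw[j] / n for j in range(m)]
    return w, b


def stacked_scores(w, b, columns):
    """Ensemble probability for each sample from the selected base scores."""
    return [1.0 / (1.0 + math.exp(-(b + sum(wj * col[i]
                                             for wj, col in zip(w, columns)))))
            for i in range(len(columns[0]))]


# Synthetic stand-in data (assumption): 1 = landslide cell, 0 = stable cell.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
noise = {"model_a": 0.3, "model_b": 0.4, "model_c": 0.5,
         "model_d": 0.6, "model_e": 0.9, "model_f": 1.5}
base_scores = {name: [y + rng.gauss(0, sd) for y in labels]
               for name, sd in noise.items()}

selected = top_k_by_auc(labels, base_scores, k=4)
columns = [base_scores[name] for name in selected]
w, b = fit_meta_learner(labels, columns)
ensemble = stacked_scores(w, b, columns)
print(selected, round(auc(labels, ensemble), 3))
```

In a real workflow, the hypothetical score lists would be replaced by the predicted probabilities of the trained base classifiers, and the ensemble scores would be evaluated on held-out data.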
first_indexed 2024-03-07T14:41:47Z
format Article
id doaj.art-80635641e1af4db197bfd7eb8baee58e
institution Directory Open Access Journal
issn 2197-8670
language English
last_indexed 2024-03-07T14:41:47Z
publishDate 2024-02-01
publisher SpringerOpen
record_format Article
series Geoenvironmental Disasters
spelling doaj.art-80635641e1af4db197bfd7eb8baee58e 2024-03-05T20:19:18Z eng SpringerOpen Geoenvironmental Disasters 2197-8670 2024-02-01 111117 10.1186/s40677-024-00271-y Landslide susceptibility assessment of South Korea using stacking ensemble machine learning
Seung-Min Lee (National Center for AgroMeteorology, Seoul National University)
Seung-Jae Lee (National Center for AgroMeteorology, Seoul National University)
https://doi.org/10.1186/s40677-024-00271-y
Landslide
Susceptibility model
Stacking ensemble
Machine learning
title Landslide susceptibility assessment of South Korea using stacking ensemble machine learning
topic Landslide
Susceptibility model
Stacking ensemble
Machine learning
url https://doi.org/10.1186/s40677-024-00271-y