A hybrid of ensemble machine learning models with RFE and Boruta wrapper-based algorithms for flash flood susceptibility assessment

Bibliographic Details
Main Authors: Alireza Habibi, Mahmoud Reza Delavar, Mohammad Sadegh Sadeghian, Borzoo Nazari, Saeid Pirasteh
Format: Article
Language: English
Published: Elsevier 2023-08-01
Series: International Journal of Applied Earth Observations and Geoinformation
Online Access: http://www.sciencedirect.com/science/article/pii/S156984322300225X
Description
Summary: Flash floods are among the world's most destructive natural disasters, and developing optimum hybrid Machine Learning (ML) models for flash flood susceptibility (FFS) modeling remains a challenge. This study proposed novel intelligence algorithms based on a hybrid of several ensemble ML models (i.e., Bagged Flexible Discriminant Analysis (BAFDA), Extreme Gradient Boosting (XGB), Rotation Forest (ROF) and Boosted Generalized Additive Model (BGAM)) and wrapper-based factor optimization algorithms (i.e., Recursive Feature Elimination (RFE) and Boruta) to improve the accuracy of FFS mapping in the Neka-Haraz watershed in Iran. In addition, the Random Search (RS) method is proposed for meta-optimization of the developed hybrid models' hyper-parameters. This study considers 20 flash flood conditioning factors (CgFs) and 380 flood and non-flood locations to create a geospatial database. The performance of each hybrid model was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC) and several validation metrics, such as efficiency. The developed hybrid models demonstrated good performance, with BGAM-Boruta achieving the highest performance (AUC = 0.953 and efficiency = 0.910), followed by ROF-Boruta (AUC = 0.952), ROF-RFE (AUC = 0.951), BAFDA-Boruta (AUC = 0.950), BGAM-RFE (AUC = 0.950), ROF (AUC = 0.949), BGAM (AUC = 0.948), BAFDA-RFE (AUC = 0.943), XGB-Boruta (AUC = 0.943), BAFDA (AUC = 0.939), XGB-RFE (AUC = 0.938) and XGB (AUC = 0.911). In the BGAM-Boruta model, areas of high to very high FFS covered about 46% of the region. Moreover, the study revealed that the CgFs distance to river, slope, rainfall, altitude, and distance to road are the most significant for FFS modeling in this region.
ISSN: 1569-8432
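
Note on the AUC metric: the summary ranks the models by the area under the ROC curve. As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of (flood, non-flood) location pairs in which the flood location receives the higher susceptibility score, with ties counted as half. The function name roc_auc and the six example labels and scores are hypothetical and made up for illustration; they are not data or code from the article, and the authors' actual evaluation procedure may differ.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a flood location, 0 for a non-flood location.
    scores: predicted susceptibility, higher meaning more flood-prone.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one flood and one non-flood location")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # flood location ranked above non-flood location
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three flood and three non-flood locations.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = roc_auc(example_labels, example_scores)  # 8 of 9 pairs ranked correctly
```

An AUC of 1.0 means every flood location is scored above every non-flood location, while 0.5 corresponds to random ranking, which is why values around 0.95 in the summary indicate strong discrimination.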