ADASYN-LOF Algorithm for Imbalanced Tornado Samples

In the past few years, tornado early warning and forecasting have begun to incorporate artificial intelligence (AI) and machine learning (ML) algorithms to improve identification efficiency. Applying machine learning algorithms to tornado detection usually runs into class imbalance problems because torna...

Bibliographic Details
Main Authors: Zhipeng Qing, Qiangyu Zeng, Hao Wang, Yin Liu, Taisong Xiong, Shihao Zhang
Format: Article
Language: English
Published: MDPI AG 2022-03-01
Series: Atmosphere
Subjects: tornadoes; class imbalance; machine learning
Online Access: https://www.mdpi.com/2073-4433/13/4/544
author Zhipeng Qing
Qiangyu Zeng
Hao Wang
Yin Liu
Taisong Xiong
Shihao Zhang
collection DOAJ
description In the past few years, tornado early warning and forecasting have begun to incorporate artificial intelligence (AI) and machine learning (ML) algorithms to improve identification efficiency. Applying machine learning algorithms to tornado detection usually runs into class imbalance problems because tornadoes are rare events in weather processes. The ADASYN-LOF algorithm (ALA) is proposed to address the imbalance problem of tornado sample sets based on radar data. ALA uses the adaptive synthetic (ADASYN) sampling algorithm to increase the number of minority class samples and then applies the local outlier factor (LOF) algorithm to denoise the synthetic samples. Its performance is tested with support vector machine (SVM), artificial neural network (ANN), and random forest (RF) models. The results show that ALA improves the performance and noise immunity of the models, significantly increases the tornado recognition rate, and has the potential to increase the early tornado warning time. Compared with the ADASYN, Synthetic Minority Oversampling Technique (SMOTE), and SMOTE-LOF algorithms, ALA is more effective at preprocessing imbalanced data for the SVM and ANN models.
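The description outlines a two-stage idea: oversample the minority class with ADASYN, then use LOF to discard synthetic samples that look like outliers. The following is a minimal NumPy sketch of that idea, not the authors' implementation. The function names (`adasyn`, `lof_scores`, `adasyn_lof`), the neighbour count `k`, the `beta` balance factor, the `lof_threshold` of 1.5, and the choice to score synthetic samples against the original minority samples are all assumptions made for illustration; the record does not state these details. The toy two-dimensional data stands in for radar-derived features.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))

def adasyn(X, y, minority=1, k=5, beta=1.0, rng=None):
    """Synthetic minority samples, concentrated near majority-dominated regions.
    Assumes more than k minority samples and no duplicate rows."""
    rng = np.random.default_rng(rng)
    X_min = X[y == minority]
    n_min = len(X_min)
    n_maj = len(X) - n_min
    G = int(round((n_maj - n_min) * beta))  # total synthetic samples wanted

    # Share of majority-class points among each minority point's k neighbours
    nn_all = np.argsort(pairwise_dist(X_min, X), axis=1)[:, 1:k + 1]
    r = (y[nn_all] != minority).mean(axis=1)
    if r.sum() == 0:  # no minority point has a majority neighbour
        return np.empty((0, X.shape[1]))
    g = np.rint(r / r.sum() * G).astype(int)  # per-point sample counts

    # Interpolate between each minority point and a random minority neighbour
    nn_min = np.argsort(pairwise_dist(X_min, X_min), axis=1)[:, 1:k + 1]
    synth = []
    for i, gi in enumerate(g):
        for _ in range(gi):
            z = X_min[rng.choice(nn_min[i])]
            synth.append(X_min[i] + rng.random() * (z - X_min[i]))
    return np.array(synth).reshape(-1, X.shape[1])

def lof_scores(X_ref, X_query, k=5):
    """LOF of each query point measured against the reference set."""
    d_ref = pairwise_dist(X_ref, X_ref)
    order = np.argsort(d_ref, axis=1)[:, 1:k + 1]  # skip the point itself
    k_dist = d_ref[np.arange(len(X_ref)), order[:, -1]]
    # Reachability distance: reach(p, o) = max(k_dist(o), d(p, o))
    reach = np.maximum(k_dist[order], np.take_along_axis(d_ref, order, axis=1))
    lrd_ref = 1.0 / (reach.mean(axis=1) + 1e-12)  # local reachability density

    d_q = pairwise_dist(X_query, X_ref)
    order_q = np.argsort(d_q, axis=1)[:, :k]
    reach_q = np.maximum(k_dist[order_q], np.take_along_axis(d_q, order_q, axis=1))
    lrd_q = 1.0 / (reach_q.mean(axis=1) + 1e-12)
    return lrd_ref[order_q].mean(axis=1) / lrd_q

def adasyn_lof(X, y, minority=1, k=5, lof_threshold=1.5, rng=None):
    """Oversample with ADASYN, then keep synthetic points with a low LOF."""
    synth = adasyn(X, y, minority, k, rng=rng)
    scores = lof_scores(X[y == minority], synth, k)
    kept = synth[scores <= lof_threshold]  # hypothetical cut-off
    X_out = np.vstack([X, kept])
    y_out = np.concatenate([y, np.full(len(kept), minority, dtype=y.dtype)])
    return X_out, y_out

# Toy data: 200 majority points and 20 minority points with some overlap
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(0, 1, (200, 2)), data_rng.normal(2, 1, (20, 2))])
y = np.array([0] * 200 + [1] * 20)
X_bal, y_bal = adasyn_lof(X, y, rng=0)
print("majority:", (y_bal == 0).sum(), "minority:", (y_bal == 1).sum())
```

The resampled arrays could then be passed to any classifier, such as the SVM, ANN, or RF models named in the description. In practice, library implementations of ADASYN and LOF would likely be preferable to this hand-written sketch.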
first_indexed 2024-03-09T11:09:51Z
format Article
id doaj.art-30c31e4bcc4e428886e745e9317be2ac
institution Directory of Open Access Journals
issn 2073-4433
language English
last_indexed 2024-03-09T11:09:51Z
publishDate 2022-03-01
publisher MDPI AG
record_format Article
series Atmosphere
spelling doaj.art-30c31e4bcc4e428886e745e9317be2ac
2023-12-01T00:46:22Z
eng
MDPI AG
Atmosphere
2073-4433
2022-03-01
Volume 13, Issue 4, Article 544
10.3390/atmos13040544
ADASYN-LOF Algorithm for Imbalanced Tornado Samples
Zhipeng Qing (CMA Key Laboratory of Atmospheric Sounding, Chengdu 610225, China)
Qiangyu Zeng (CMA Key Laboratory of Atmospheric Sounding, Chengdu 610225, China)
Hao Wang (CMA Key Laboratory of Atmospheric Sounding, Chengdu 610225, China)
Yin Liu (Jiangsu Meteorological Observation Center, Nanjing 210041, China)
Taisong Xiong (CMA Key Laboratory of Atmospheric Sounding, Chengdu 610225, China)
Shihao Zhang (CMA Key Laboratory of Atmospheric Sounding, Chengdu 610225, China)
https://www.mdpi.com/2073-4433/13/4/544
tornadoes
class imbalance
machine learning
title ADASYN-LOF Algorithm for Imbalanced Tornado Samples
topic tornadoes
class imbalance
machine learning
url https://www.mdpi.com/2073-4433/13/4/544