GBDT-IL: Incremental Learning of Gradient Boosting Decision Trees to Detect Botnets in Internet of Things

The rapid development of the Internet of Things (IoT) has brought many conveniences to our daily life. However, it has also introduced various security risks that need to be addressed. The proliferation of IoT botnets is one of these risks. Most researchers have had some success in IoT botnet det...


Bibliographic Details
Main Authors: Ruidong Chen, Tianci Dai, Yanfeng Zhang, Yukun Zhu, Xin Liu, Erfan Zhao
Format: Article
Language: English
Published: MDPI AG 2024-03-01
Series: Sensors
Subjects: botnets, internet of things, feature dimensionality reduction, concept drift
Online Access: https://www.mdpi.com/1424-8220/24/7/2083
author Ruidong Chen
Tianci Dai
Yanfeng Zhang
Yukun Zhu
Xin Liu
Erfan Zhao
collection DOAJ
description The rapid development of the Internet of Things (IoT) has brought many conveniences to our daily life. However, it has also introduced various security risks that need to be addressed. The proliferation of IoT botnets is one of these risks. Most researchers have had some success in IoT botnet detection using artificial intelligence (AI). However, they have not considered the impact of dynamic network data streams on the models in real-world environments. Over time, existing detection models struggle to cope with evolving botnets. To address this challenge, we propose an incremental learning approach based on Gradient Boosting Decision Trees (GBDT), called GBDT-IL, for detecting botnet traffic in IoT environments. It improves the robustness of the framework by adapting to dynamic IoT data using incremental learning. Additionally, it incorporates an enhanced Fisher Score feature selection algorithm, which enables the model to achieve high accuracy even with a smaller set of optimal features, thereby reducing the system resources required for model training. To evaluate the effectiveness of our approach, we conducted experiments on the BoT-IoT, N-BaIoT, MedBIoT, and MQTTSet datasets. We compared our method with similar feature selection algorithms and existing concept drift detection algorithms. The experimental results demonstrated that our method achieved an average accuracy of 99.81% using only 25 features, outperforming similar feature selection algorithms. Furthermore, our method achieved an average accuracy of 96.88% in the presence of different types of drifting data, which is 2.98% higher than the best available concept drift detection algorithms, while maintaining a low average false positive rate of 3.02%.
first_indexed 2024-04-24T10:35:27Z
format Article
id doaj.art-3e9d81095e9244a3a6f41e595fb81fcd
institution Directory of Open Access Journals
issn 1424-8220
language English
last_indexed 2024-04-24T10:35:27Z
publishDate 2024-03-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-3e9d81095e9244a3a6f41e595fb81fcd
2024-04-12T13:26:10Z
eng
MDPI AG
Sensors
1424-8220
2024-03-01
Volume 24, Issue 7, Article 2083
10.3390/s24072083
GBDT-IL: Incremental Learning of Gradient Boosting Decision Trees to Detect Botnets in Internet of Things
Ruidong Chen (0), Tianci Dai (1), Yanfeng Zhang (2), Yukun Zhu (3), Xin Liu (4), Erfan Zhao (5)
(0) Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
(1) Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
(2) Intelligent Policing Key Laboratory of Sichuan Province, Sichuan Police College, Luzhou 646000, China
(3) Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
(4) School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
(5) Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
https://www.mdpi.com/1424-8220/24/7/2083
botnets; internet of things; feature dimensionality reduction; concept drift
title GBDT-IL: Incremental Learning of Gradient Boosting Decision Trees to Detect Botnets in Internet of Things
topic botnets
internet of things
feature dimensionality reduction
concept drift
url https://www.mdpi.com/1424-8220/24/7/2083