GBDT-IL: Incremental Learning of Gradient Boosting Decision Trees to Detect Botnets in Internet of Things
The rapid development of the Internet of Things (IoT) has brought many conveniences to our daily life. However, it has also introduced various security risks that need to be addressed. The proliferation of IoT botnets is one of these risks. Most researchers have had some success in IoT botnet det...
Main Authors: | Ruidong Chen, Tianci Dai, Yanfeng Zhang, Yukun Zhu, Xin Liu, Erfan Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-03-01 |
Series: | Sensors |
Subjects: | botnets; internet of things; feature dimensionality reduction; concept drift |
Online Access: | https://www.mdpi.com/1424-8220/24/7/2083 |
_version_ | 1827286557408624640 |
---|---|
author | Ruidong Chen; Tianci Dai; Yanfeng Zhang; Yukun Zhu; Xin Liu; Erfan Zhao |
author_facet | Ruidong Chen; Tianci Dai; Yanfeng Zhang; Yukun Zhu; Xin Liu; Erfan Zhao |
author_sort | Ruidong Chen |
collection | DOAJ |
description | The rapid development of the Internet of Things (IoT) has brought many conveniences to our daily life. However, it has also introduced various security risks that need to be addressed. The proliferation of IoT botnets is one of these risks. Most researchers have had some success in IoT botnet detection using artificial intelligence (AI). However, they have not considered the impact of dynamic network data streams on the models in real-world environments. Over time, existing detection models struggle to cope with evolving botnets. To address this challenge, we propose an incremental learning approach based on Gradient Boosting Decision Trees (GBDT), called GBDT-IL, for detecting botnet traffic in IoT environments. It improves the robustness of the framework by adapting to dynamic IoT data using incremental learning. Additionally, it incorporates an enhanced Fisher Score feature selection algorithm, which enables the model to achieve high accuracy even with a smaller set of optimal features, thereby reducing the system resources required for model training. To evaluate the effectiveness of our approach, we conducted experiments on the BoT-IoT, N-BaIoT, MedBIoT, and MQTTSet datasets. We compared our method with similar feature selection algorithms and existing concept drift detection algorithms. The experimental results demonstrated that our method achieved an average accuracy of 99.81% using only 25 features, outperforming similar feature selection algorithms. Furthermore, our method achieved an average accuracy of 96.88% in the presence of different types of drifting data, which is 2.98% higher than the best available concept drift detection algorithms, while maintaining a low average false positive rate of 3.02%. |
first_indexed | 2024-04-24T10:35:27Z |
format | Article |
id | doaj.art-3e9d81095e9244a3a6f41e595fb81fcd |
institution | Directory of Open Access Journals |
issn | 1424-8220 |
language | English |
last_indexed | 2024-04-24T10:35:27Z |
publishDate | 2024-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-3e9d81095e9244a3a6f41e595fb81fcd; 2024-04-12T13:26:10Z; eng; MDPI AG; Sensors; 1424-8220; 2024-03-01; 24(7): 2083; 10.3390/s24072083; GBDT-IL: Incremental Learning of Gradient Boosting Decision Trees to Detect Botnets in Internet of Things; Ruidong Chen (Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China); Tianci Dai (Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China); Yanfeng Zhang (Intelligent Policing Key Laboratory of Sichuan Province, Sichuan Police College, Luzhou 646000, China); Yukun Zhu (Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China); Xin Liu (School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China); Erfan Zhao (Institute for Cyber Security, School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China); https://www.mdpi.com/1424-8220/24/7/2083; botnets; internet of things; feature dimensionality reduction; concept drift |
title | GBDT-IL: Incremental Learning of Gradient Boosting Decision Trees to Detect Botnets in Internet of Things |
topic | botnets; internet of things; feature dimensionality reduction; concept drift |
url | https://www.mdpi.com/1424-8220/24/7/2083 |
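
The description above mentions Fisher Score feature selection. As a rough illustration only, the following sketch computes the standard (textbook) Fisher score, the ratio of between-class to within-class variance for each feature, and keeps the top-k features. It is not the enhanced variant the authors propose, whose details this record does not give; the function names `fisher_scores` and `select_top_k` and the toy data are hypothetical.

```python
from statistics import fmean


def fisher_scores(X, y):
    """Return the standard Fisher score of every feature column.

    X is a list of rows (lists of floats); y holds one class label per row.
    Score_j = sum_c n_c * (mu_cj - mu_j)^2 / sum_c n_c * var_cj
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # class means spread around the overall mean
        within = 0.0   # spread of samples inside each class
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            mu_c = fmean(values)
            var_c = fmean([(v - mu_c) ** 2 for v in values])
            between += len(values) * (mu_c - overall_mean) ** 2
            within += len(values) * var_c
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 0.9]]
y = [0, 0, 1, 1]
print(select_top_k(X, y, 1))
```

A higher score means the feature separates the classes more cleanly relative to its noise, which is why a small, high-scoring subset can retain most of the accuracy while cutting training cost.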