A Heterogeneous Machine Learning Ensemble Framework for Malicious Webpage Detection
The growing dependence on digital systems has heightened the risks posed by cybersecurity threats. This paper proposes a new method for detecting malicious webpages among several adversary activities. As shown in previous studies, malicious URL detection performance is significantly affected by the...
Main Authors: | Sam-Shin Shin, Seung-Goo Ji, Sung-Sam Hong |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-11-01 |
Series: | Applied Sciences |
Subjects: | security, malicious URL detection, machine learning, ensemble learning, artificial intelligence |
Online Access: | https://www.mdpi.com/2076-3417/12/23/12070 |
_version_ | 1797463662880882688 |
---|---|
author | Sam-Shin Shin, Seung-Goo Ji, Sung-Sam Hong |
author_facet | Sam-Shin Shin, Seung-Goo Ji, Sung-Sam Hong |
author_sort | Sam-Shin Shin |
collection | DOAJ |
description | The growing dependence on digital systems has heightened the risks posed by cybersecurity threats. Among the many kinds of adversarial activity, this paper targets malicious webpages and proposes a new method for detecting them. As shown in previous studies, malicious URL detection performance is significantly affected by the features of the training dataset. The overall performance of different machine learning models varies with those features, so relying on a single model is not desirable in every environment. To address these limitations, we propose an ensemble approach that combines different machine learning models. Our proposed method outperforms the existing single model by 6%, allowing for the detection of an additional 141 malicious URLs. In this study, repetitive tasks are automated, improving the performance of the different machine learning models. In addition, the proposed framework builds an advanced feature set based on URL and web content and includes the most optimized detection model structure. The proposed technology can contribute to defining such a feature set and detection model structure, and to research on automated technology for detecting malicious websites, such as phishing websites and sites that distribute malicious code. |
first_indexed | 2024-03-09T17:53:57Z |
format | Article |
id | doaj.art-1e435e86850e41e49bfcc2ceaca10a01 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T17:53:57Z |
publishDate | 2022-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-1e435e86850e41e49bfcc2ceaca10a01; 2023-11-24T10:30:30Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-11-01; vol. 12, no. 23, art. 12070; 10.3390/app122312070; A Heterogeneous Machine Learning Ensemble Framework for Malicious Webpage Detection; Sam-Shin Shin (Internet Incident Response Technology Team, Korea Internet & Security Agency, Naju 58324, Republic of Korea); Seung-Goo Ji (Internet Incident Response Technology Team, Korea Internet & Security Agency, Naju 58324, Republic of Korea); Sung-Sam Hong (Department of Multimedia Contents, Jangan University, Hwaseong 18331, Republic of Korea); The growing dependence on digital systems has heightened the risks posed by cybersecurity threats. Among the many kinds of adversarial activity, this paper targets malicious webpages and proposes a new method for detecting them. As shown in previous studies, malicious URL detection performance is significantly affected by the features of the training dataset. The overall performance of different machine learning models varies with those features, so relying on a single model is not desirable in every environment. To address these limitations, we propose an ensemble approach that combines different machine learning models. Our proposed method outperforms the existing single model by 6%, allowing for the detection of an additional 141 malicious URLs. In this study, repetitive tasks are automated, improving the performance of the different machine learning models. In addition, the proposed framework builds an advanced feature set based on URL and web content and includes the most optimized detection model structure. The proposed technology can contribute to defining such a feature set and detection model structure, and to research on automated technology for detecting malicious websites, such as phishing websites and sites that distribute malicious code.; https://www.mdpi.com/2076-3417/12/23/12070; security; malicious URL detection; machine learning; ensemble learning; artificial intelligence |
spellingShingle | Sam-Shin Shin; Seung-Goo Ji; Sung-Sam Hong; A Heterogeneous Machine Learning Ensemble Framework for Malicious Webpage Detection; Applied Sciences; security; malicious URL detection; machine learning; ensemble learning; artificial intelligence |
title | A Heterogeneous Machine Learning Ensemble Framework for Malicious Webpage Detection |
title_sort | heterogeneous machine learning ensemble framework for malicious webpage detection |
topic | security; malicious URL detection; machine learning; ensemble learning; artificial intelligence |
url | https://www.mdpi.com/2076-3417/12/23/12070 |
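
The description above mentions an ensemble of different machine learning models over URL and web-content features, but this record does not say how the models are combined. As a rough illustration only, the sketch below assumes soft voting: each base model returns a maliciousness score in [0, 1], and the ensemble averages the scores and applies a threshold. The three scorers, their token list, and the 0.5 threshold are hypothetical hand-written stand-ins for trained models; they are not the features or models used in the paper.

```python
from typing import Callable, Sequence
from urllib.parse import urlparse

# A base model here is any function mapping a URL to a score in [0, 1].
Scorer = Callable[[str], float]


def length_scorer(url: str) -> float:
    """Hypothetical stand-in model 1: longer URLs score as more suspicious."""
    return min(len(url) / 100.0, 1.0)


def ip_host_scorer(url: str) -> float:
    """Hypothetical stand-in model 2: a raw IPv4 address as the host is suspicious."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    is_ipv4 = len(parts) == 4 and all(p.isdigit() for p in parts)
    return 1.0 if is_ipv4 else 0.0


def token_scorer(url: str) -> float:
    """Hypothetical stand-in model 3: counts a few phishing-style tokens."""
    tokens = ("login", "verify", "update")  # illustrative list, not from the paper
    hits = sum(token in url.lower() for token in tokens)
    return min(hits / 2.0, 1.0)


# The heterogeneous ensemble: base models of different kinds, equally weighted.
SCORERS: Sequence[Scorer] = (length_scorer, ip_host_scorer, token_scorer)


def soft_vote(url: str, scorers: Sequence[Scorer], threshold: float = 0.5) -> bool:
    """Average the base-model scores and flag the URL if the mean reaches the threshold."""
    mean_score = sum(scorer(url) for scorer in scorers) / len(scorers)
    return mean_score >= threshold


if __name__ == "__main__":
    for candidate in (
        "http://192.168.0.1/login/verify/account-update?id=1234567890",
        "https://example.com/",
    ):
        print(candidate, soft_vote(candidate, SCORERS))
```

Averaging lets a strong signal from one model compensate for a weak signal from another, which is one common motivation for combining models whose performance depends on different features. In practice each scorer would be replaced by a trained classifier's predicted probability.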