A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts

Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches...

Bibliographic Details
Main Authors:	Yanbin Wang, Wenrui Ma, Haitao Xu, Yiwei Liu, Peng Yin
Format:	Article
Language:	English
Published:	MDPI AG 2023-06-01
Series:	Applied Sciences
Subjects:	phishing attack detection; multi-view learning; transformer; self-supervised learning
Online Access:	https://www.mdpi.com/2076-3417/13/13/7429

author	Yanbin Wang, Wenrui Ma, Haitao Xu, Yiwei Liu, Peng Yin
collection	DOAJ
description	Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches are not always effective due to the following reasons: (1) highly concealed phishing websites may employ tactics such as masquerading URL addresses to deceive machine learning models, and (2) phishing attackers frequently change their phishing website URLs to evade detection. In this study, we propose a robust, multi-view Transformer model with an expert-mixture mechanism for accurate phishing website detection utilizing website URLs, attributes, content, and behavioral information. Specifically, we first adapted a pretrained language model for URL representation learning by applying adversarial post-training learning in order to extract semantic information from URLs. Next, we captured the attribute, content, and behavioral features of the websites and encoded them as vectors, which, alongside the URL embeddings, constitute the website’s multi-view information. Subsequently, we introduced a mixture-of-experts mechanism into the Transformer network to learn knowledge from different views and adaptively fuse information from various views. The proposed method outperforms state-of-the-art approaches in evaluations of real phishing websites, demonstrating greater performance with less label dependency. Furthermore, we show the superior robustness and enhanced adaptability of the proposed method to unseen samples and data drift in more challenging experimental settings.
first_indexed	2024-03-11T01:47:57Z
format	Article
id	doaj.art-478833d15884446a8bf9bd638739312f
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-11T01:47:57Z
publishDate	2023-06-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-478833d15884446a8bf9bd638739312f | 2023-11-18T16:06:16Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2023-06-01 | 13(13):7429 | 10.3390/app13137429 | A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts | Yanbin Wang (School of Cyber and Technology, Zhejiang University, Hangzhou 310027, China) | Wenrui Ma (School of Cyber and Technology, Zhejiang University, Hangzhou 310027, China) | Haitao Xu (School of Cyber and Technology, Zhejiang University, Hangzhou 310027, China) | Yiwei Liu (Defence Industry Secrecy Examination and Certification Center, Beijing 100089, China) | Peng Yin (Defence Industry Secrecy Examination and Certification Center, Beijing 100089, China) | https://www.mdpi.com/2076-3417/13/13/7429 | phishing attack detection; multi-view learning; transformer; self-supervised learning
title	A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts
topic	phishing attack detection; multi-view learning; transformer; self-supervised learning
url	https://www.mdpi.com/2076-3417/13/13/7429