Behind the Bait: Delving into PhishTank's hidden data
Phishing constitutes a form of social engineering that aims to deceive individuals through email communication. Extensive prior research has underscored phishing as one of the most commonly employed attack vectors for infiltrating organizational networks. A prevalent method involves misleading the t...
Main Authors: | Affan Yasin, Rubia Fatima, Javed Ali Khan, Wasif Afzal |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2024-02-01 |
Series: | Data in Brief |
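Subjects: | Phished URL, Social engineering, Email security, Web security, Computer security, Artificial intelligence |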
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340923009903 |
_version_ | 1827353645897744384 |
---|---|
author | Affan Yasin Rubia Fatima Javed Ali Khan Wasif Afzal |
author_facet | Affan Yasin Rubia Fatima Javed Ali Khan Wasif Afzal |
author_sort | Affan Yasin |
collection | DOAJ |
description | Phishing constitutes a form of social engineering that aims to deceive individuals through email communication. Extensive prior research has underscored phishing as one of the most commonly employed attack vectors for infiltrating organizational networks. A prevalent method involves misleading the target by employing phishing URLs concealed through hyperlink strategies. PhishTank, a website employing the concept of crowd-sourcing, aggregates phishing URLs and subsequently verifies their authenticity. In the course of this study, we leveraged a Python script to extract data from the PhishTank website, amassing a comprehensive dataset comprising over 190,0000 phishing URLs. This dataset is a valuable resource that can be harnessed by both researchers and practitioners for enhancing phishing filters, fortifying firewalls, security education, and refining training and testing models, among other applications. |
first_indexed | 2024-03-08T03:30:27Z |
format | Article |
id | doaj.art-959b41dde3f747e4bc5822d457d0e2aa |
institution | Directory Open Access Journal |
issn | 2352-3409 |
language | English |
last_indexed | 2024-03-08T03:30:27Z |
publishDate | 2024-02-01 |
publisher | Elsevier |
record_format | Article |
series | Data in Brief |
spelling | doaj.art-959b41dde3f747e4bc5822d457d0e2aa2024-02-11T05:10:42ZengElsevierData in Brief2352-34092024-02-0152109959Behind the Bait: Delving into PhishTank's hidden dataAffan Yasin0Rubia Fatima1Javed Ali Khan2Wasif Afzal3School of Software, Northwestern Polytechnical University, Xian 710072, Shaanxi, ChinaSchool of Software, Tsinghua University, Beijing, ChinaDepartment of Computer Science, School of Physics, Engineering & Computer Science, University of Hertfordshire, Hatfield, UKSchool of Innovation, Design and Engineering, Mälardalen University, Västerås, Sweden; Corresponding author.Phishing constitutes a form of social engineering that aims to deceive individuals through email communication. Extensive prior research has underscored phishing as one of the most commonly employed attack vectors for infiltrating organizational networks. A prevalent method involves misleading the target by employing phishing URLs concealed through hyperlink strategies. PhishTank, a website employing the concept of crowd-sourcing, aggregates phishing URLs and subsequently verifies their authenticity. In the course of this study, we leveraged a Python script to extract data from the PhishTank website, amassing a comprehensive dataset comprising over 190,0000 phishing URLs. This dataset is a valuable resource that can be harnessed by both researchers and practitioners for enhancing phishing filters, fortifying firewalls, security education, and refining training and testing models, among other applications.http://www.sciencedirect.com/science/article/pii/S2352340923009903Phished URLSocial engineeringEmail securityWeb securityComputer securityArtificial intelligence |
spellingShingle | Affan Yasin Rubia Fatima Javed Ali Khan Wasif Afzal Behind the Bait: Delving into PhishTank's hidden data Data in Brief Phished URL Social engineering Email security Web security Computer security Artificial intelligence |
title | Behind the Bait: Delving into PhishTank's hidden data |
title_full | Behind the Bait: Delving into PhishTank's hidden data |
title_fullStr | Behind the Bait: Delving into PhishTank's hidden data |
title_full_unstemmed | Behind the Bait: Delving into PhishTank's hidden data |
title_short | Behind the Bait: Delving into PhishTank's hidden data |
title_sort | behind the bait delving into phishtank s hidden data |
topic | Phished URL Social engineering Email security Web security Computer security Artificial intelligence |
url | http://www.sciencedirect.com/science/article/pii/S2352340923009903 |
work_keys_str_mv | AT affanyasin behindthebaitdelvingintophishtankshiddendata AT rubiafatima behindthebaitdelvingintophishtankshiddendata AT javedalikhan behindthebaitdelvingintophishtankshiddendata AT wasifafzal behindthebaitdelvingintophishtankshiddendata |