Intelligent Framework for Detecting Predatory Publishing Venues
Predatory publishing venues publish questionable articles and pose a global threat to the integrity and quality of the scientific literature. They have given rise to the dark side of scholarly publishing and their effects have reached political, societal, economic, and health aspects. Given their co...
Main Authors: | Wed Majed Bin Ateeq, Hend S. Al-Khalifa |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
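Subjects: | Automatic detection, deceptive publishing, fake website detection, deep learning, machine learning, predatory venues |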
Online Access: | https://ieeexplore.ieee.org/document/10056132/ |
_version_ | 1827997850460487680 |
---|---|
author | Wed Majed Bin Ateeq, Hend S. Al-Khalifa |
collection | DOAJ |
description | Predatory publishing venues publish questionable articles and pose a global threat to the integrity and quality of the scientific literature. They have given rise to the dark side of scholarly publishing and their effects have reached political, societal, economic, and health aspects. Given their consequences and proliferation, several solutions have been developed to help detect them; however, these solutions are manual and time-consuming. Researchers, students, and readers need a tool that automatically detects predatory venues and their violations. In this study, we propose an intelligent framework that automatically detects predatory venues and their violations using different artificial intelligence techniques. This work makes two contributions: (1) a dataset of 9,866 journals annotated as predatory or legitimate, and (2) an intelligent framework for classifying a venue as legitimate or predatory, with appropriate reasoning. The framework was evaluated using seven machine learning and deep learning models, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Neural Networks (NNs), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Bidirectional Encoder Representations from Transformers (BERT), and A Lite BERT (ALBERT), together with different feature representation techniques. The results showed that the CNN model outperformed the other models in the journal classification task, with an F1 score of 0.96. For the task of providing appropriate reasoning, the SVM model achieved the best micro F1 score of 0.67. |
first_indexed | 2024-04-10T05:35:20Z |
format | Article |
id | doaj.art-2be551ac3a53424d89ef49e335c97f21 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-10T05:35:20Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-2be551ac3a53424d89ef49e335c97f21; 2023-03-07T00:01:12Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11; pp. 20582-20618; DOI 10.1109/ACCESS.2023.3250256; document 10056132; Intelligent Framework for Detecting Predatory Publishing Venues; Wed Majed Bin Ateeq (https://orcid.org/0000-0002-1344-3746), Department of Information Technology, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia; Hend S. Al-Khalifa (https://orcid.org/0000-0002-7328-4935), Department of Information Technology, King Saud University, Riyadh, Saudi Arabia; https://ieeexplore.ieee.org/document/10056132/; Automatic detection; deceptive publishing; fake website detection; deep learning; machine learning; predatory venues |
title | Intelligent Framework for Detecting Predatory Publishing Venues |
topic | Automatic detection, deceptive publishing, fake website detection, deep learning, machine learning, predatory venues |
url | https://ieeexplore.ieee.org/document/10056132/ |