Identifying Persian bots on Twitter; which feature is more important: Account Information or Tweet Contents?

Bibliographic Details
Main Authors: Mojtaba Mazoochi, Nasrin Asadi, Farzaneh Rahmani, Leila Rabiei
Format: Article
Language: English
Published: Iran Telecom Research Center 2023-02-01
Series: International Journal of Information and Communication Technology Research
Online Access: http://ijict.itrc.ac.ir/article-1-534-en.pdf
Description
Summary: The spread of the internet and smartphones in recent years has made social networks popular and easily accessible. Despite their benefits, such as ease of interpersonal communication and a space for the free expression of opinions, these networks also create opportunities for destructive activities such as spreading false information or using fake accounts for fraud. Fake accounts are mainly managed by bots, so identifying and suspending bots could substantially help increase the popularity and favorability of social networks. In this paper, we try to identify Persian bots on Twitter, a challenging task given the problems of processing colloquial Persian. To this end, a set of features based on user account information and user activity is combined with content features of tweets to classify users with several machine learning algorithms, including Random Forest, Logistic Regression, and SVM. Experiments on a dataset of Persian-language users show that the proposed methods perform well. Random Forest is the most accurate of these classifiers, achieving a balanced accuracy of 93.86%.
ISSN: 2251-6107, 2783-4425
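
The summary reports results as balanced accuracy rather than plain accuracy. As a minimal sketch of why that distinction matters for bot detection, the Python below computes balanced accuracy using its standard definition (the mean of per-class recall). The labels and predictions are hypothetical and are not taken from the paper's dataset, and this record does not state how the authors computed the metric.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall (standard balanced accuracy)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    total = defaultdict(int)    # number of true examples per class
    correct = defaultdict(int)  # correctly predicted examples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Average the recall of every class that appears in y_true.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical, imbalanced example: 8 human accounts and 2 bots.
y_true = ["human"] * 8 + ["bot"] * 2
# A trivial classifier that labels every account as human.
y_pred = ["human"] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain:.2f}")
print(f"balanced accuracy: {balanced:.2f}")
```

In this toy case the trivial classifier looks strong on plain accuracy while finding no bots at all, which balanced accuracy exposes by weighting both classes equally.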