Twitter-based gender recognition using transformers

Bibliographic Details
Main Authors: Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, Jude D. Kong
Format: Article
Language: English
Published: AIMS Press 2023-08-01
Series: Mathematical Biosciences and Engineering
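Subjects: bert; electra; gender recognition; levit; roberta; social media; swin transformer; transformers; vit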
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2023711?viewType=HTML
_version_ 1797690990049361920
author Zahra Movahedi Nia
Ali Ahmadi
Bruce Mellado
Jianhong Wu
James Orbinski
Ali Asgary
Jude D. Kong
author_sort Zahra Movahedi Nia
collection DOAJ
description Social media contains useful information about people and society that could help advance research in many areas of health, such as mental health, health surveillance, socio-economic inequality and gender vulnerability (e.g., by applying opinion mining, emotion/sentiment analysis and statistical analysis). User demographics provide rich information that could help study these subjects further. However, user demographics such as gender are considered private and are not freely available. In this study, we propose a model based on transformers to predict a user's gender from their images and tweets. The image-based classification model is trained in two ways: using the profile image of the user, and using the various images posted by the user on Twitter. The first method uses a Twitter gender recognition dataset that is publicly available on Kaggle; the second uses the PAN-18 dataset. Several transformer models, i.e., vision transformer (ViT), LeViT and Swin Transformer, are fine-tuned on both image datasets and then compared. Next, different transformer models, namely bidirectional encoder representations from transformers (BERT), RoBERTa and ELECTRA, are fine-tuned to recognize the user's gender from their tweets. This is highly beneficial because not all users provide an image that indicates their gender; the gender of such users could be detected from their tweets. The significance of the image and text classification models was evaluated using the Mann-Whitney U test. Finally, the combination model improved on the accuracy of the image and text classification models by 11.73% and 5.26%, respectively, for the Kaggle dataset and by 8.55% and 9.8%, respectively, for the PAN-18 dataset. This shows that the image and text classification models can complement each other, each providing information the other lacks. Our overall multimodal method has an accuracy of 88.11% for the Kaggle dataset and 89.24% for the PAN-18 dataset and outperforms state-of-the-art models. Our work benefits research that critically requires user demographic information such as gender to further analyze and study social media content for health-related issues.
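The description does not say how the image and text models are combined. As a rough illustration only, the sketch below assumes a simple late-fusion rule: a weighted average of the two models' class probabilities, with a fallback to the text model when a user has no usable image. The function names, the label order and the `image_weight` parameter are all hypothetical and are not taken from the article.

```python
from typing import List, Optional, Sequence

# Hypothetical label order; the article's actual class encoding is not given here.
LABELS = ("female", "male")


def fuse(text_probs: Sequence[float],
         image_probs: Optional[Sequence[float]] = None,
         image_weight: float = 0.5) -> List[float]:
    """Combine per-class probabilities from a text and an image classifier.

    Assumption: late fusion by weighted averaging. If no image prediction
    is available, the text probabilities are returned unchanged.
    """
    if image_probs is None:
        return list(text_probs)
    if len(text_probs) != len(image_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    return [image_weight * i + (1.0 - image_weight) * t
            for i, t in zip(image_probs, text_probs)]


def predict(text_probs: Sequence[float],
            image_probs: Optional[Sequence[float]] = None,
            image_weight: float = 0.5) -> str:
    """Return the label with the highest fused probability."""
    fused = fuse(text_probs, image_probs, image_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return LABELS[best]
```

With this rule, a confident image model can overturn a weak text prediction and vice versa, which is one way two classifiers can complement each other. In practice the weight would be tuned on held-out data.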
first_indexed 2024-03-12T02:08:07Z
format Article
id doaj.art-5321bbea3d154f3d8501c49ca42c0a30
institution Directory Open Access Journal
issn 1551-0018
language English
last_indexed 2024-03-12T02:08:07Z
publishDate 2023-08-01
publisher AIMS Press
record_format Article
series Mathematical Biosciences and Engineering
spelling doaj.art-5321bbea3d154f3d8501c49ca42c0a30; 2023-09-07T01:12:15Z; eng; AIMS Press; Mathematical Biosciences and Engineering; ISSN 1551-0018; 2023-08-01; vol. 20, no. 9, pp. 15962-15981; doi 10.3934/mbe.2023711; Twitter-based gender recognition using transformers
Authors: Zahra Movahedi Nia; Ali Ahmadi; Bruce Mellado; Jianhong Wu; James Orbinski; Ali Asgary; Jude D. Kong
Affiliations:
1. Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), York University, Canada
2. Laboratory for Industrial and Applied Mathematics, York University, Canada
3. K.N Toosi University, Faculty of Computer Engineering, Tehran, Iran
4. Advanced Disaster, Emergency and Rapid-Response Simulation (ADERSIM), York University, Toronto, Ontario, Canada
5. School of Physics, Institute for Collider Particle Physics, University of Witwatersrand, Johannesburg, South Africa
6. Dahdaleh Institute for Global Health Research, York University, Canada
title Twitter-based gender recognition using transformers
title_sort twitter based gender recognition using transformers
topic bert
electra
gender recognition
levit
roberta
social media
swin transformer
transformers
vit
url https://www.aimspress.com/article/doi/10.3934/mbe.2023711?viewType=HTML