Twitter-based gender recognition using transformers
Social media contains useful information about people and society that could help advance research in many different areas of health (e.g. by applying opinion mining, emotion/sentiment analysis and statistical analysis) such as mental health, health surveillance, socio-economic inequality and gender...
Main Authors: | Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, Jude D. Kong |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press 2023-08-01 |
Series: | Mathematical Biosciences and Engineering |
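Subjects: | bert, electra, gender recognition, levit, roberta, social media, swin transformer, transformers, vit |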
Online Access: | https://www.aimspress.com/article/doi/10.3934/mbe.2023711?viewType=HTML |
_version_ | 1797690990049361920 |
---|---|
author | Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, Jude D. Kong |
author_facet | Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, Jude D. Kong |
author_sort | Zahra Movahedi Nia |
collection | DOAJ |
description | Social media contains useful information about people and society that could help advance research in many areas of health, such as mental health, health surveillance, socio-economic inequality and gender vulnerability (e.g. by applying opinion mining, emotion/sentiment analysis and statistical analysis). User demographics provide rich information that could help study these subjects further. However, user demographics such as gender are considered private and are not freely available. In this study, we propose a transformer-based model to predict a user's gender from their images and tweets. The image-based classification model is trained in two ways: using the user's profile image and using the various images the user posts on Twitter. The first method uses a Twitter gender recognition dataset publicly available on Kaggle, and the second uses the PAN-18 dataset. Several transformer models, i.e. vision transformer (ViT), LeViT and Swin Transformer, are fine-tuned on both image datasets and then compared. Next, different transformer models, namely bidirectional encoder representations from transformers (BERT), RoBERTa and ELECTRA, are fine-tuned to recognize the user's gender from their tweets. This is highly beneficial because not all users provide an image that indicates their gender; the gender of such users could be detected from their tweets. The significance of the image and text classification models was evaluated using the Mann-Whitney U test. Finally, the combination model improved the accuracy of the image and text classification models by 11.73% and 5.26% for the Kaggle dataset and by 8.55% and 9.8% for the PAN-18 dataset, respectively. This shows that the image and text classification models can complement each other by providing additional information to one another. Our overall multimodal method has an accuracy of 88.11% for the Kaggle dataset and 89.24% for the PAN-18 dataset and outperforms state-of-the-art models. Our work benefits research that critically requires user demographic information such as gender to further analyze and study social media content for health-related issues. |
first_indexed | 2024-03-12T02:08:07Z |
format | Article |
id | doaj.art-5321bbea3d154f3d8501c49ca42c0a30 |
institution | Directory Open Access Journal |
issn | 1551-0018 |
language | English |
last_indexed | 2024-03-12T02:08:07Z |
publishDate | 2023-08-01 |
publisher | AIMS Press |
record_format | Article |
series | Mathematical Biosciences and Engineering |
spelling | doaj.art-5321bbea3d154f3d8501c49ca42c0a30; 2023-09-07T01:12:15Z; eng; AIMS Press; Mathematical Biosciences and Engineering; ISSN 1551-0018; 2023-08-01; vol. 20, no. 9, pp. 15962-15981; doi:10.3934/mbe.2023711; Twitter-based gender recognition using transformers; Authors: Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, Jude D. Kong; Affiliations: 1. Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), York University, Canada; 2. Laboratory for Industrial and Applied Mathematics, York University, Canada; 3. K.N Toosi University, Faculty of Computer Engineering, Tehran, Iran; 4. Advanced Disaster, Emergency and Rapid-Response Simulation (ADERSIM), York University, Toronto, Ontario, Canada; 5. School of Physics, Institute for Collider Particle Physics, University of Witwatersrand, Johannesburg, South Africa; 6. Dahdaleh Institute for Global Health Research, York University, Canada; https://www.aimspress.com/article/doi/10.3934/mbe.2023711?viewType=HTML; Keywords: bert, electra, gender recognition, levit, roberta, social media, swin transformer, transformers, vit |
spellingShingle | Zahra Movahedi Nia, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, Jude D. Kong; Twitter-based gender recognition using transformers; Mathematical Biosciences and Engineering; bert, electra, gender recognition, levit, roberta, social media, swin transformer, transformers, vit |
title | Twitter-based gender recognition using transformers |
title_full | Twitter-based gender recognition using transformers |
title_fullStr | Twitter-based gender recognition using transformers |
title_full_unstemmed | Twitter-based gender recognition using transformers |
title_short | Twitter-based gender recognition using transformers |
title_sort | twitter based gender recognition using transformers |
topic | bert, electra, gender recognition, levit, roberta, social media, swin transformer, transformers, vit |
url | https://www.aimspress.com/article/doi/10.3934/mbe.2023711?viewType=HTML |
work_keys_str_mv | AT zahramovahedinia twitterbasedgenderrecognitionusingtransformers AT aliahmadi twitterbasedgenderrecognitionusingtransformers AT brucemellado twitterbasedgenderrecognitionusingtransformers AT jianhongwu twitterbasedgenderrecognitionusingtransformers AT jamesorbinski twitterbasedgenderrecognitionusingtransformers AT aliasgary twitterbasedgenderrecognitionusingtransformers AT judedkong twitterbasedgenderrecognitionusingtransformers |
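
The description above mentions that the image and text classification models were compared using the Mann-Whitney U test, but this record does not say how the test was set up. As a minimal sketch of the statistic itself, assuming each model is summarized by a list of per-run accuracies, the U values can be computed as below. The function name `mann_whitney_u` and the accuracy lists are hypothetical and invented for illustration; they are not results from the article.

```python
from itertools import product


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a > b; a tie counts as 0.5.
    U_a + U_b always equals len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a, b in product(sample_a, sample_b):
        if a > b:
            u_a += 1.0
        elif a == b:
            u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical per-run accuracies, NOT taken from the article.
image_model_acc = [0.76, 0.78, 0.75, 0.77, 0.79]
text_model_acc = [0.82, 0.84, 0.81, 0.83, 0.80]

u_image, u_text = mann_whitney_u(image_model_acc, text_model_acc)
print(u_image, u_text)  # prints: 0.0 25.0
```

A U value of 0 for one sample means every one of its values is below every value in the other sample. This sketch stops at the statistic; obtaining a p-value would normally be left to a statistics library such as SciPy's `scipy.stats.mannwhitneyu`.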