COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter Data Set of Antivaccine Content, Vaccine Misinformation, and Conspiracies

Background: False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns, posing a threat to global public health. Misinformation originating from various sources has been spreading on the web since the beginning of the COVID-19 pandemic. Ant...

Bibliographic Details
Main Authors: Goran Muric, Yusong Wu, Emilio Ferrara
Format: Article
Language: English
Published: JMIR Publications 2021-11-01
Series: JMIR Public Health and Surveillance
Online Access: https://publichealth.jmir.org/2021/11/e30642
author Goran Muric
Yusong Wu
Emilio Ferrara
collection DOAJ
description Background: False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns, posing a threat to global public health. Misinformation originating from various sources has been spreading on the web since the beginning of the COVID-19 pandemic. Antivaccine activists have also begun to use platforms such as Twitter to promote their views. To properly understand the phenomenon of vaccine hesitancy through the lens of social media, it is of great importance to gather the relevant data.
Objective: In this paper, we describe a data set of Twitter posts and Twitter accounts that publicly exhibit a strong antivaccine stance. The data set is made available to the research community via our AvaxTweets data set GitHub repository. We characterize the collected accounts in terms of prominent hashtags, shared news sources, and most likely political leaning.
Methods: We started the ongoing data collection on October 18, 2020, leveraging the Twitter streaming application programming interface (API) to follow a set of specific antivaccine-related keywords. Then, we collected the historical tweets of the set of accounts that engaged in spreading antivaccination narratives between October 2020 and December 2020, leveraging the Academic Track Twitter API. The political leaning of the accounts was estimated by measuring the political bias of the media outlets they shared.
Results: We gathered two curated Twitter data collections and made them publicly available: (1) a streaming keyword–centered data collection with more than 1.8 million tweets, and (2) a historical account–level data collection with more than 135 million tweets. The accounts engaged in the antivaccination narratives lean to the right (conservative) direction of the political spectrum. The vaccine hesitancy is fueled by misinformation originating from websites with already questionable credibility.
Conclusions: The vaccine-related misinformation on social media may exacerbate the levels of vaccine hesitancy, hampering progress toward vaccine-induced herd immunity, and could potentially increase the number of infections related to new COVID-19 variants. For these reasons, understanding vaccine hesitancy through the lens of social media is of paramount importance. Because data access is the first obstacle to attain this goal, we published a data set that can be used in studying antivaccine misinformation on social media and enable a better understanding of vaccine hesitancy.
first_indexed 2024-03-12T13:00:30Z
format Article
id doaj.art-7674fe97fdfd4982a8289e4106a6aee7
institution Directory Open Access Journal
issn 2369-2960
language English
last_indexed 2024-03-12T13:00:30Z
publishDate 2021-11-01
publisher JMIR Publications
record_format Article
series JMIR Public Health and Surveillance
spelling doaj.art-7674fe97fdfd4982a8289e4106a6aee7 2023-08-28T19:47:36Z eng JMIR Publications JMIR Public Health and Surveillance 2369-2960 2021-11-01 7 11 e30642 10.2196/30642 COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter Data Set of Antivaccine Content, Vaccine Misinformation, and Conspiracies
Goran Muric https://orcid.org/0000-0002-3700-2347
Yusong Wu https://orcid.org/0000-0001-6692-3607
Emilio Ferrara https://orcid.org/0000-0002-1942-2831
https://publichealth.jmir.org/2021/11/e30642
title COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter Data Set of Antivaccine Content, Vaccine Misinformation, and Conspiracies
url https://publichealth.jmir.org/2021/11/e30642