Early detection of promoted campaigns on social media

Abstract Social media expose millions of users every day to information campaigns - some emerging organically from grassroots activity, others sustained by advertising or other coordinated efforts. These campaigns contribute to the shaping of collective opinions. While most information campaigns are...

Bibliographic Details
Main Authors: Onur Varol, Emilio Ferrara, Filippo Menczer, Alessandro Flammini
Format: Article
Language: English
Published: SpringerOpen 2017-07-01
Series: EPJ Data Science
Subjects: social media, information campaigns, advertising, early detection
Online Access: http://link.springer.com/article/10.1140/epjds/s13688-017-0111-y
collection DOAJ
description Abstract Social media expose millions of users every day to information campaigns - some emerging organically from grassroots activity, others sustained by advertising or other coordinated efforts. These campaigns contribute to the shaping of collective opinions. While most information campaigns are benign, some may be deployed for nefarious purposes, including terrorist propaganda, political astroturf, and financial market manipulation. It is therefore important to be able to detect whether a meme is being artificially promoted at the very moment it becomes wildly popular. This problem has important social implications and poses numerous technical challenges. As a first step, here we focus on discriminating between trending memes that are either organic or promoted by means of advertisement. The classification is not trivial: ads cause bursts of attention that can be easily mistaken for those of organic trends. We designed a machine learning framework to classify memes that have been labeled as trending on Twitter. After trending, we can rely on a large volume of activity data. Early detection, occurring immediately at trending time, is a more challenging problem due to the minimal volume of activity data that is available prior to trending. Our supervised learning framework exploits hundreds of time-varying features to capture changing network and diffusion patterns, content and sentiment information, timing signals, and user meta-data. We explore different methods for encoding feature time series. Using millions of tweets containing trending hashtags, we achieve 75% AUC score for early detection, increasing to above 95% after trending. We evaluate the robustness of the algorithms by introducing random temporal shifts on the trend time series. Feature selection analysis reveals that content cues provide consistently useful signals; user features are more informative for early detection, while network and timing features are more helpful once more data is available.
first_indexed 2024-04-12T10:44:03Z
id doaj.art-c78c7c1271934069b92fccd951b18bc1
institution Directory Open Access Journal
issn 2193-1127
last_indexed 2024-04-12T10:44:03Z
spelling doaj.art-c78c7c1271934069b92fccd951b18bc1; 2022-12-22T03:36:31Z; eng; SpringerOpen; EPJ Data Science; 2193-1127; 2017-07-01; 61119; 10.1140/epjds/s13688-017-0111-y; Early detection of promoted campaigns on social media; Onur Varol, Emilio Ferrara, Filippo Menczer, Alessandro Flammini (each: School of Informatics and Computing, Indiana University)
title Early detection of promoted campaigns on social media
topic social media
information campaigns
advertising
early detection
url http://link.springer.com/article/10.1140/epjds/s13688-017-0111-y