Reddit CrosspostNet—Studying Reddit Communities with Large-Scale Crosspost Graph Networks

Bibliographic Details
Main Authors: Jan Sawicki, Maria Ganzha, Marcin Paprzycki, Yutaka Watanobe
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Algorithms
Subjects: graph network-based analysis; Reddit; subreddits; online social networks; big data; crossposts
Online Access: https://www.mdpi.com/1999-4893/16/9/424
author Jan Sawicki
Maria Ganzha
Marcin Paprzycki
Yutaka Watanobe
collection DOAJ
description As the largest open social medium on the Internet, Reddit is widely studied in the scientific literature. Because of its structured form and division into topical subfora (subreddits), research often concerns connections and interactions between users and/or whole communities based on the subreddit structure. The relations between communities are most often studied with graph networks, built using various creation algorithms. This work proposes a novel approach to building and understanding the structure of Reddit. It is based on crossposts: posts that appeared on one subreddit and were then crossposted to another. After capturing one year of crossposts, a directed weighted graph network was created from seven million posts from over 10,000 of the most popular subreddits. Using graph network algorithms, its characteristics are captured and compared to similar studies. We identify the information “sinks” and “sources”, the most active crossposting subreddits. Moreover, graph network metrics were obtained: the degree distribution (modeled with the Power Law), clustering, community detection, and connected components structure are compared to previous studies on Reddit network(s), yielding consistent, but also novel, results. Finally, the relations between extensively studied subreddits (e.g., r/AITA, r/Parenting, r/politics) and new ones that were not accounted for in previous research are summarized, opening new paths for data-driven studies.
first_indexed 2024-03-10T23:07:53Z
format Article
id doaj.art-cd5984a72ffb4011b499cbd077a66f6e
institution Directory Open Access Journal
issn 1999-4893
language English
last_indexed 2024-03-10T23:07:53Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj.art-cd5984a72ffb4011b499cbd077a66f6e
2023-11-19T09:12:56Z
eng
MDPI AG
Algorithms
1999-4893
2023-09-01
Volume 16, Issue 9, Article 424
DOI 10.3390/a16090424
Jan Sawicki (Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-662 Warsaw, Poland)
Maria Ganzha (Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-662 Warsaw, Poland)
Marcin Paprzycki (Systems Research Institute, Polish Academy of Sciences, 01-447 Warsaw, Poland)
Yutaka Watanobe (Department of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu 965-8580, Japan)
title Reddit CrosspostNet—Studying Reddit Communities with Large-Scale Crosspost Graph Networks
topic graph network-based analysis
Reddit
subreddits
online social networks
big data
crossposts
url https://www.mdpi.com/1999-4893/16/9/424
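The description in this record outlines a directed weighted graph built from crossposts and the identification of "sources" and "sinks". The following is a minimal sketch of that idea, not the authors' implementation. It assumes each crosspost is available as a (source subreddit, target subreddit) pair, that an edge points from the subreddit where a post first appeared to the one it was crossposted to, and that "sources" and "sinks" can be read as the nodes with the largest total outgoing and incoming edge weight. The sample data and the function names `build_crosspost_graph` and `weighted_degrees` are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: one (source, target) pair per crosspost.
# The record does not specify the real dataset's format; this is an assumption.
crossposts = [
    ("r/pics", "r/aww"),
    ("r/pics", "r/aww"),
    ("r/pics", "r/funny"),
    ("r/news", "r/politics"),
    ("r/news", "r/funny"),
    ("r/aww", "r/funny"),
]


def build_crosspost_graph(pairs):
    """Return {(source, target): weight}; weight is the number of crossposts."""
    return Counter(pairs)


def weighted_degrees(edges):
    """Return (out_strength, in_strength) counters keyed by subreddit."""
    out_strength, in_strength = Counter(), Counter()
    for (source, target), weight in edges.items():
        out_strength[source] += weight  # posts leaving this subreddit
        in_strength[target] += weight   # posts arriving in this subreddit
    return out_strength, in_strength


edges = build_crosspost_graph(crossposts)
out_strength, in_strength = weighted_degrees(edges)

# Assumed reading: the top "source" has the largest outgoing weight,
# the top "sink" has the largest incoming weight.
top_source = out_strength.most_common(1)[0]
top_sink = in_strength.most_common(1)[0]
print("top source:", top_source)
print("top sink:", top_sink)
```

On real data the same edge dictionary could be loaded into a graph library to compute the other metrics the description mentions, such as clustering and connected components.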