Community-based time segmentation from network snapshots

Abstract Community detection has proved to be extremely successful in a variety of domains. However, most of the algorithms used in practice assume networks are unchanging in time. This assumption is violated for many datasets, resulting in incorrect or misleading communities. Many different algorit...

Full description

Bibliographic Details
Main Authors: Thomas Magelinski, Kathleen M. Carley
Format: Article
Language:English
Published: SpringerOpen 2019-05-01
Series:Applied Network Science
Subjects:
Online Access:http://link.springer.com/article/10.1007/s41109-019-0136-1
_version_ 1818200399929147392
author Thomas Magelinski
Kathleen M. Carley
author_facet Thomas Magelinski
Kathleen M. Carley
author_sort Thomas Magelinski
collection DOAJ
description Abstract Community detection has proved to be extremely successful in a variety of domains. However, most of the algorithms used in practice assume networks are unchanging in time. This assumption is violated for many datasets, resulting in incorrect or misleading communities. Many different algorithms to rectify this problem have been proposed. Most of them, however, focus on community evolution rather than abrupt changes. The problem of change detection is easier than that of community evolution, and is often sufficient. Here, we propose an algorithm for determining community-based change points from network snapshots. Networks can then be aggregated between change points, and analyzed without violating assumptions. There are three network types that we have defined our algorithm for, each having a case study: static nodesets, semi-static nodesets, and dynamic nodesets. The case studies for these network types are: the Ukrainian Legislature, the Enron email network, and Twitter data from Ukraine. We empirically verify our algorithm in each case study, and compare results to two popular alternatives: Generalized Louvain and GraphScope. We show the impracticality of Generalized Louvain and that our method is less sensitive than GraphScope. Lastly, we use our first two case studies to determine optimal parameters for an anomaly-detection-based streaming method. We then demonstrate that the streaming method was capable of determining events both from data collection errors and from internal network disruptions.
first_indexed 2024-12-12T02:37:03Z
format Article
id doaj.art-b0ad4d4579ce4da4bb113d9b781b4d9c
institution Directory Open Access Journal
issn 2364-8228
language English
last_indexed 2024-12-12T02:37:03Z
publishDate 2019-05-01
publisher SpringerOpen
record_format Article
series Applied Network Science
spelling doaj.art-b0ad4d4579ce4da4bb113d9b781b4d9c2022-12-22T00:41:15ZengSpringerOpenApplied Network Science2364-82282019-05-014111910.1007/s41109-019-0136-1Community-based time segmentation from network snapshotsThomas Magelinski0Kathleen M. Carley1CASOS, Institute for Software Research, Carnegie Mellon UniversityCASOS, Institute for Software Research, Carnegie Mellon UniversityAbstract Community detection has proved to be extremely successful in a variety of domains. However, most of the algorithms used in practice assume networks are unchanging in time. This assumption is violated for many datasets, resulting in incorrect or misleading communities. Many different algorithms to rectify this problem have been proposed. Most of them, however, focus on community evolution rather than abrupt changes. The problem of change detection is easier than that of community evolution, and is often sufficient. Here, we propose an algorithm for determining community-based change points from network snapshots. Networks can then be aggregated between change points, and analyzed without violating assumptions. There are three network types that we have defined our algorithm for, each having a case study: static nodesets, semi-static nodesets, and dynamic nodesets. The case studies for these network types are: the Ukrainian Legislature, the Enron email network, and Twitter data from Ukraine. We empirically verify our algorithm in each case study, and compare results to two popular alternatives: Generalized Louvain and GraphScope. We show the impracticality of Generalized Louvain and that our method is less sensitive than GraphScope. Lastly, we use our first two case studies to determine optimal parameters for an anomaly-detection-based streaming method. We then demonstrate that the streaming method was capable of determining events both from data collection errors and from internal network disruptions.http://link.springer.com/article/10.1007/s41109-019-0136-1Dynamic communitiesCommunity detectionTemporal networksStreaming dataVerkhovna radaTwitter networks
spellingShingle Thomas Magelinski
Kathleen M. Carley
Community-based time segmentation from network snapshots
Applied Network Science
Dynamic communities
Community detection
Temporal networks
Streaming data
Verkhovna rada
Twitter networks
title Community-based time segmentation from network snapshots
title_full Community-based time segmentation from network snapshots
title_fullStr Community-based time segmentation from network snapshots
title_full_unstemmed Community-based time segmentation from network snapshots
title_short Community-based time segmentation from network snapshots
title_sort community based time segmentation from network snapshots
topic Dynamic communities
Community detection
Temporal networks
Streaming data
Verkhovna rada
Twitter networks
url http://link.springer.com/article/10.1007/s41109-019-0136-1
work_keys_str_mv AT thomasmagelinski communitybasedtimesegmentationfromnetworksnapshots
AT kathleenmcarley communitybasedtimesegmentationfromnetworksnapshots