Community-based time segmentation from network snapshots
Abstract Community detection has proved to be extremely successful in a variety of domains. However, most of the algorithms used in practice assume networks are unchanging in time. This assumption is violated for many datasets, resulting in incorrect or misleading communities. Many different algorit...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
SpringerOpen
2019-05-01
|
Series: | Applied Network Science |
Subjects: | |
Online Access: | http://link.springer.com/article/10.1007/s41109-019-0136-1 |
_version_ | 1818200399929147392 |
---|---|
author | Thomas Magelinski Kathleen M. Carley |
author_facet | Thomas Magelinski Kathleen M. Carley |
author_sort | Thomas Magelinski |
collection | DOAJ |
description | Abstract Community detection has proved to be extremely successful in a variety of domains. However, most of the algorithms used in practice assume networks are unchanging in time. This assumption is violated for many datasets, resulting in incorrect or misleading communities. Many different algorithms to rectify this problem have been proposed. Most of them, however, focus on community evolution rather than abrupt changes. The problem of change detection is easier than that of community evolution, and is often sufficient. Here, we propose an algorithm for determining community-based change points from network snapshots. Networks can then be aggregated between change points, and analyzed without violating assumptions. There are three network types that we have defined our algorithm for, each having a case study: static nodesets, semi-static nodesets, and dynamic nodesets. The case studies for these network types are: the Ukrainian Legislature, the Enron email network, and Twitter data from Ukraine. We empirically verify our algorithm in each case study, and compare results to two popular alternatives: Generalized Louvain and GraphScope. We show the impracticality of Generalized Louvain and that our method is less sensitive than GraphScope. Lastly, we use our first two case studies to determine optimal parameters for an anomaly-detection-based streaming method. We then demonstrate that the streaming method was capable of determining events both from data collection errors and from internal network disruptions. |
first_indexed | 2024-12-12T02:37:03Z |
format | Article |
id | doaj.art-b0ad4d4579ce4da4bb113d9b781b4d9c |
institution | Directory Open Access Journal |
issn | 2364-8228 |
language | English |
last_indexed | 2024-12-12T02:37:03Z |
publishDate | 2019-05-01 |
publisher | SpringerOpen |
record_format | Article |
series | Applied Network Science |
spelling | doaj.art-b0ad4d4579ce4da4bb113d9b781b4d9c2022-12-22T00:41:15ZengSpringerOpenApplied Network Science2364-82282019-05-014111910.1007/s41109-019-0136-1Community-based time segmentation from network snapshotsThomas Magelinski0Kathleen M. Carley1CASOS, Institute for Software Research, Carnegie Mellon UniversityCASOS, Institute for Software Research, Carnegie Mellon UniversityAbstract Community detection has proved to be extremely successful in a variety of domains. However, most of the algorithms used in practice assume networks are unchanging in time. This assumption is violated for many datasets, resulting in incorrect or misleading communities. Many different algorithms to rectify this problem have been proposed. Most of them, however, focus on community evolution rather than abrupt changes. The problem of change detection is easier than that of community evolution, and is often sufficient. Here, we propose an algorithm for determining community-based change points from network snapshots. Networks can then be aggregated between change points, and analyzed without violating assumptions. There are three network types that we have defined our algorithm for, each having a case study: static nodesets, semi-static nodesets, and dynamic nodesets. The case studies for these network types are: the Ukrainian Legislature, the Enron email network, and Twitter data from Ukraine. We empirically verify our algorithm in each case study, and compare results to two popular alternatives: Generalized Louvain and GraphScope. We show the impracticality of Generalized Louvain and that our method is less sensitive than GraphScope. Lastly, we use our first two case studies to determine optimal parameters for an anomaly-detection-based streaming method. We then demonstrate that the streaming method was capable of determining events both from data collection errors and from internal network disruptions.http://link.springer.com/article/10.1007/s41109-019-0136-1Dynamic communitiesCommunity detectionTemporal networksStreaming dataVerkhovna radaTwitter networks |
spellingShingle | Thomas Magelinski Kathleen M. Carley Community-based time segmentation from network snapshots Applied Network Science Dynamic communities Community detection Temporal networks Streaming data Verkhovna rada Twitter networks |
title | Community-based time segmentation from network snapshots |
title_full | Community-based time segmentation from network snapshots |
title_fullStr | Community-based time segmentation from network snapshots |
title_full_unstemmed | Community-based time segmentation from network snapshots |
title_short | Community-based time segmentation from network snapshots |
title_sort | community based time segmentation from network snapshots |
topic | Dynamic communities Community detection Temporal networks Streaming data Verkhovna rada Twitter networks |
url | http://link.springer.com/article/10.1007/s41109-019-0136-1 |
work_keys_str_mv | AT thomasmagelinski communitybasedtimesegmentationfromnetworksnapshots AT kathleenmcarley communitybasedtimesegmentationfromnetworksnapshots |