Quality assessment and community detection methods for anonymized mobility data in the Italian Covid context

Abstract We discuss how to assess the reliability of partial, anonymized mobility data and compare two different methods to identify spatial communities based on movements: Greedy Modularity Clustering (GMC) and the novel Critical Variable Selection (CVS). These capture different aspects of mobility...


Bibliographic Details
Main Authors: Jules Morand, Shoichi Yip, Yannis Velegrakis, Gianluca Lattanzi, Raffaello Potestio, Luca Tubiana
Format: Article
Language: English
Published: Nature Portfolio 2024-02-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-54878-0
author Jules Morand
Shoichi Yip
Yannis Velegrakis
Gianluca Lattanzi
Raffaello Potestio
Luca Tubiana
collection DOAJ
description Abstract We discuss how to assess the reliability of partial, anonymized mobility data and compare two different methods to identify spatial communities based on movements: Greedy Modularity Clustering (GMC) and the novel Critical Variable Selection (CVS). These capture different aspects of mobility: direct population fluxes (GMC) and the probability for individuals to move between two nodes (CVS). As a test case, we consider movements of Italians before and during the SARS-CoV-2 pandemic, using Facebook users’ data and publicly available information from the Italian National Institute of Statistics (Istat) to construct daily mobility networks at the interprovincial level. Using the Perron-Frobenius (PF) theorem, we show that the mean stochastic network has a stationary population density state comparable with data from Istat, and that this ceases to be the case if even a moderate amount of pruning is applied to the network. We then identify the first two national lockdowns through temporal clustering of the mobility networks, define two representative graphs for the lockdown and non-lockdown conditions, and perform optimal spatial community identification on both graphs using the GMC and CVS approaches. Despite the fundamental differences between the methods, the variation of information (VI) between their outputs indicates that they return similar partitions of the Italian provincial networks in both situations. The information provided can be used to inform policy, for example, to define an optimal scale for lockdown measures. Our approach is general and can be applied to other countries or geographical scales.
first_indexed 2024-03-07T15:09:26Z
format Article
id doaj.art-bc9cdc34bc7a45e0805ecac8f007d491
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-03-07T15:09:26Z
publishDate 2024-02-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-bc9cdc34bc7a45e0805ecac8f007d491; 2024-03-05T18:46:44Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-02-01; 14 1 1 13; 10.1038/s41598-024-54878-0; Quality assessment and community detection methods for anonymized mobility data in the Italian Covid context; Jules Morand (University of Trento); Shoichi Yip (University of Trento); Yannis Velegrakis (University of Trento); Gianluca Lattanzi (University of Trento); Raffaello Potestio (University of Trento); Luca Tubiana (University of Trento); https://doi.org/10.1038/s41598-024-54878-0
title Quality assessment and community detection methods for anonymized mobility data in the Italian Covid context
url https://doi.org/10.1038/s41598-024-54878-0
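Two of the quantities named in the abstract can be illustrated with a short sketch: the stationary state of a stochastic mobility network, which the Perron-Frobenius theorem guarantees to be unique for a strongly connected, aperiodic network, and the variation of information (VI) between two partitions of the same nodes. This is not the authors' implementation. The function names and the toy numbers are hypothetical, the example uses neither the Facebook nor the Istat data, and the GMC and CVS methods themselves are not shown.

```python
import math
from collections import Counter

import numpy as np


def stationary_distribution(flows):
    """Stationary state of the random walk defined by a matrix of flows.

    flows[i, j] is a non-negative weight for movement from node i to node j.
    Assumption: every row has a positive sum and the network is strongly
    connected and aperiodic, so the stationary state is unique.
    """
    flows = np.asarray(flows, dtype=float)
    # Normalise each row so that it becomes a probability distribution.
    transition = flows / flows.sum(axis=1, keepdims=True)
    # Left eigenvectors of P are right eigenvectors of its transpose.
    eigenvalues, eigenvectors = np.linalg.eig(transition.T)
    # The leading (Perron) eigenvalue of a stochastic matrix equals 1.
    leading = np.argmin(np.abs(eigenvalues - 1.0))
    pi = np.real(eigenvectors[:, leading])
    # Dividing by the sum fixes both the scale and the overall sign.
    return pi / pi.sum()


def variation_of_information(labels_a, labels_b):
    """VI(A, B) = H(A|B) + H(B|A), in nats, for two labelings of the same nodes."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same nodes")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p_ab / p_a reduces to n_ab / count_a[a], and likewise for b.
        vi -= (n_ab / n) * (
            math.log(n_ab / count_a[a]) + math.log(n_ab / count_b[b])
        )
    return vi


if __name__ == "__main__":
    # Hypothetical three-node flow matrix (rows: origin, columns: destination).
    toy_flows = [[90, 8, 2], [10, 80, 10], [5, 15, 80]]
    print(stationary_distribution(toy_flows))
    # Two hypothetical partitions of six nodes into communities.
    print(variation_of_information([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]))
```

A VI of zero means the two partitions are identical up to relabeling, and larger values mean they disagree more. In the same spirit, the stationary vector could be compared with normalised census populations to judge how well a mobility network reflects the underlying population density.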