scikit-mobility: A Python Library for the Analysis, Generation, and Risk Assessment of Mobility Data

Bibliographic Details
Main Authors: Luca Pappalardo, Filippo Simini, Gianni Barlacchi, Roberto Pellungrini
Format: Article
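Language: English
Published: Foundation for Open Access Statistics 2022-07-01
Series: Journal of Statistical Software
Subjects: data science, human mobility, big data, network science, data mining, python, mathematical modelling, migration models, privacy, software, open source
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/3942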
author Luca Pappalardo
Filippo Simini
Gianni Barlacchi
Roberto Pellungrini
collection DOAJ
description The last decade has witnessed the emergence of massive mobility datasets, such as tracks generated by GPS devices, call detail records, and geo-tagged posts from social media platforms. These datasets have fostered a vast scientific production on various applications of mobility analysis, ranging from computational epidemiology to urban planning and transportation engineering. One strand of literature addresses data cleaning issues related to raw spatiotemporal trajectories, while a second line of research focuses on discovering the statistical "laws" that govern human movements. Significant effort has also gone into designing algorithms that generate synthetic trajectories able to realistically reproduce the laws of human mobility. Last but not least, a line of research addresses the crucial problem of privacy, proposing techniques to perform the re-identification of individuals in a database. A survey of the state of the art shows that no statistical software supports scientists and practitioners in all of the above aspects of mobility data analysis. In this paper, we propose scikit-mobility, a Python library that aims to provide an environment to reproduce existing research, analyze mobility data, and simulate human mobility habits. scikit-mobility is efficient and easy to use as it extends pandas, a popular Python library for data analysis. Moreover, scikit-mobility provides the user with many functionalities, from visualizing trajectories to generating synthetic data, from analyzing statistical patterns to assessing the privacy risk related to the analysis of mobility datasets.
first_indexed 2024-03-13T08:00:30Z
format Article
id doaj.art-cde8ae0756cb4107b29e5efd1224bfd5
institution Directory of Open Access Journals
issn 1548-7660
language English
last_indexed 2024-03-13T08:00:30Z
publishDate 2022-07-01
publisher Foundation for Open Access Statistics
record_format Article
series Journal of Statistical Software
spelling doaj.art-cde8ae0756cb4107b29e5efd1224bfd5; 2023-06-01T18:48:04Z; eng; Foundation for Open Access Statistics; Journal of Statistical Software; ISSN 1548-7660; 2022-07-01; vol. 103, pp. 1-38; DOI 10.18637/jss.v103.i04; 3750
Luca Pappalardo (https://orcid.org/0000-0002-1547-6007), ISTI-CNR
Filippo Simini (https://orcid.org/0000-0001-8675-3529), Argonne National Laboratory
Gianni Barlacchi (https://orcid.org/0000-0002-9896-0610), Amazon, Alexa AI
Roberto Pellungrini (https://orcid.org/0000-0003-3268-9271), University of Pisa
title scikit-mobility: A Python Library for the Analysis, Generation, and Risk Assessment of Mobility Data
topic data science
human mobility
big data
network science
data mining
python
mathematical modelling
migration models
privacy
software
open source
url https://www.jstatsoft.org/index.php/jss/article/view/3942