Merging Datasets of CyberSecurity Incidents for Fun and Insight

Providing an adequate assessment of their cyber-security posture requires companies and organisations to collect information about threats from a wide range of sources. One such source is history, understood as the knowledge about past cyber-security incidents, their size, type of attacks, industr...


Bibliographic Details
Main Authors: Giovanni Abbiati, Silvio Ranise, Antonio Schizzerotto, Alberto Siena
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-01-01
Series: Frontiers in Big Data
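Subjects: cyber security, data analysis, security incidents statistics, methodological framework, data breaches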
Online Access: https://www.frontiersin.org/articles/10.3389/fdata.2020.521132/full
collection DOAJ
description Providing an adequate assessment of their cyber-security posture requires companies and organisations to collect information about threats from a wide range of sources. One such source is history, understood as knowledge of past cyber-security incidents: their size, type of attack, industry sector, and so on. Ideally, given a large enough dataset of past security incidents, it would be possible to analyze it with automated tools and draw conclusions that may help in preventing future incidents. Unfortunately, it seems that only a few publicly available datasets of this kind are of good quality. The paper reports our initial efforts in collecting all publicly available security incident datasets and building a single, large dataset that can be used to draw statistically significant observations. To assess its statistical quality, we analyze the resulting combined dataset against the original ones. Additionally, we perform an analysis of the combined dataset and compare our results with the existing literature. Finally, we present our findings, discuss the limitations of the proposed approach, and point out interesting research directions.
id doaj.art-c58eca96e3a8412895a1d03a8a748e00
institution Directory of Open Access Journals
issn 2624-909X
doi 10.3389/fdata.2020.521132
volume 3
author_affiliation Giovanni Abbiati: Department of Social and Political Sciences, University of Milan, Milan, Italy; Fondazione Bruno Kessler, Trento, Italy
Silvio Ranise: Fondazione Bruno Kessler, Trento, Italy; Department of Mathematics, University of Trento, Trento, Italy
Antonio Schizzerotto: Fondazione Bruno Kessler, Trento, Italy; Department of Mathematics, University of Trento, Trento, Italy
Alberto Siena: Fondazione Bruno Kessler, Trento, Italy
topic cyber security
data analysis
security incidents statistics
methodological framework
data breaches