Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction

Traffic accidents are one of the most important concerns of the world, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. There are many factors that are responsible for causing road accidents. If these factors can be better understo...

Bibliographic Details
Main Authors: Daniel Santos, José Saias, Paulo Quaresma, Vítor Beires Nogueira
Format: Article
Language: English
Published: MDPI AG 2021-11-01
Series: Computers
Subjects:
Online Access: https://www.mdpi.com/2073-431X/10/12/157
author Daniel Santos
José Saias
Paulo Quaresma
Vítor Beires Nogueira
collection DOAJ
description Traffic accidents are one of the most important concerns of the world, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. There are many factors that are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damage and its severity. The purpose of this work is to identify these factors using accident data from 2016 to 2019 from the district of Setúbal, Portugal. This work aims at developing models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis of the accident data. In addition, this study proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. Supervised machine learning methods such as decision trees (DT), random forests (RF), logistic regression (LR), and naive Bayes (NB) are used, as well as unsupervised machine learning techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing the severity of a road accident. Further, the results of the predictive model suggest that the RF model could be a useful tool for forecasting accident hotspots.
format Article
id doaj.art-5ebd57ff8cf44fd6b51a90dfefc9f8e8
institution Directory Open Access Journal
issn 2073-431X
language English
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Computers
spelling doaj.art-5ebd57ff8cf44fd6b51a90dfefc9f8e8; 2023-11-23T07:46:36Z; eng; MDPI AG; Computers; ISSN 2073-431X; 2021-11-01; vol. 10, no. 12, article 157; DOI 10.3390/computers10120157
Author affiliation (all four authors): Informatics Department, University of Évora, 7002-554 Évora, Portugal
title Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction
topic machine learning
data analysis
road accident data
clustering
decision trees
random forests
url https://www.mdpi.com/2073-431X/10/12/157