Accident severity prediction modeling for road safety using random forest algorithm: an analysis of Indian highways [version 2; peer review: 1 approved, 2 approved with reservations]

Background: Road accidents claim around 1.35 million lives annually, with countries like India facing a significant impact. In 2019, India reported 449,002 road accidents, causing 151,113 deaths and 451,361 injuries. Accident severity modeling helps understand contributing factors and develop preven...

Bibliographic Details
Main Authors: Humera Khanum, Mir Iqbal Faheem, Anshul Garg
Format: Article
Language: English
Published: F1000 Research Ltd 2023-10-01
Series: F1000Research
Subjects: Traffic Accidents; Accident Severity; Road Safety; Accident Prediction Modeling; Random Forest
Online Access: https://f1000research.com/articles/12-494/v2
collection DOAJ
description Background: Road accidents claim around 1.35 million lives annually worldwide, and countries such as India are heavily affected. In 2019, India reported 449,002 road accidents, causing 151,113 deaths and 451,361 injuries. Accident severity modeling helps identify contributing factors and develop preventive strategies. AI models, such as random forest, offer adaptability and higher predictive accuracy compared to traditional statistical models. This study aims to develop a predictive model for traffic accident severity on Indian highways using the random forest algorithm. Methods: A multi-step methodology was employed, involving data collection and preparation, feature selection, training a random forest model, tuning parameters, and evaluating the model using accuracy and F1 score. Data sources included the Ministry of Road Transport and Highways (MoRTH) and the National Highways Authority of India (NHAI). Results: The classification model had hyperparameters ‘max depth’: 10, ‘max features’: ‘sqrt’, and ‘n estimators’: 100. The model achieved an overall accuracy of 67% and a weighted average F1-score of 0.64 on the training set, with a macro average F1-score of 0.53. Using grid search, a random forest classifier was fitted with optimal parameters, resulting in 41.47% accuracy on test data. Conclusions: The random forest classifier predicted traffic accident severity with 67% accuracy on the training set and 41.47% on the test set, suggesting possible bias or imbalance in the dataset. No clear patterns were found between the day of the week and accident occurrence or severity. Performance can be improved by addressing dataset imbalance and refining model hyperparameters. The model often underestimated accident severity, highlighting the influence of external factors. Adopting a sophisticated data recording system in line with MoRTH and Indian Roads Congress (IRC) guidelines and integrating machine learning techniques can enhance road safety modeling, decision-making, and accident prevention efforts.
first_indexed 2024-03-08T14:23:02Z
id doaj.art-41ea6ead658248a48b99021442763279
institution Directory Open Access Journal
issn 2046-1402
last_indexed 2024-03-08T14:23:02Z
spelling doaj.art-41ea6ead658248a48b99021442763279 2024-01-14T01:00:02Z eng F1000 Research Ltd F1000Research 2046-1402 2023-10-01 12157426
Humera Khanum (https://orcid.org/0000-0003-2689-6370), School of Civil Engineering, Lovely Professional University, Phagwara, Punjab, 1444411, India
Mir Iqbal Faheem, Civil Engineering Department, Deccan College of Engineering and Technology, Hyderabad, Telangana, 500001, India
Anshul Garg, School of Civil Engineering, Lovely Professional University, Phagwara, Punjab, 1444411, India
topic Traffic Accidents
Accident Severity
Road Safety
Accident Prediction Modeling
Random Forest
eng