An accident prediction model based on ARIMA in Kuala Lumpur, Malaysia using time series of actual accidents and related data
Recently, there has been an emerging trend to analyse time series data and utilise sophisticated tools for optimally fitting time series models. To date, Malaysian industrial accident data have been underutilised and lack informative records. Thus, this paper aims to investigate the Malaysian accident d...
Main Authors: | Choo, Boon Chong; Abdul Razak, Musab; Mohd Tohir, Mohd Zahirasri; Awang Biak, Dayang Radiah; Syam, Syafiie |
---|---|
Format: | Article |
Language: | English |
Published: | Universiti Putra Malaysia Press, 2023 |
Online Access: | http://psasir.upm.edu.my/id/eprint/106503/1/07%20JST-4635-2023.pdf |
author | Choo, Boon Chong; Abdul Razak, Musab; Mohd Tohir, Mohd Zahirasri; Awang Biak, Dayang Radiah; Syam, Syafiie |
---|---|
collection | UPM |
description | Recently, there has been an emerging trend to analyse time series data and utilise sophisticated tools for optimally fitting time series models. To date, Malaysian industrial accident data have been underutilised and lack informative records. Thus, this paper aims to investigate the Malaysian accident database and to evaluate the optimal forecasting models for accident prediction. The model input was based on data available from the Department of Occupational Safety and Health, Malaysia (DOSH) from 2018 until 2021, with 80% of the dataset used to train the models and the remaining 20% used for validation. Prediction using the negative binomial and Poisson distributions gave a mean absolute percentage error (MAPE) of 33% and 51%, respectively. This indicated that the negative binomial performed better than the Poisson distribution in accident frequency prediction. The available time series accident data covered four years, and stationarity was checked with the Augmented Dickey-Fuller test in R Studio. The lowest Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and other error values were used to justify the best model, which was the ARIMA(2,0,2)(2,0,0)(12) model. The ARIMA models were considered after the data showed autocorrelation. The MAPE values for ARIMA in R and for the manual time series were 40% and 49%, respectively. Therefore, accident prediction using R Studio would outperform the manually computed negative binomial and Poisson distribution predictions. Based on the findings, industrial safety practitioners should report accidents to DOSH truthfully in the era of digitalisation. This could enable future data-driven accident predictions to be carried out. |
id | upm.eprints-106503 |
institution | Universiti Putra Malaysia |
record_url | http://psasir.upm.edu.my/id/eprint/106503/ |
date | 2023-04-01 |
type | Article, PeerReviewed |
citation | Choo, Boon Chong and Abdul Razak, Musab and Mohd Tohir, Mohd Zahirasri and Awang Biak, Dayang Radiah and Syam, Syafiie (2023) An accident prediction model based on ARIMA in Kuala Lumpur, Malaysia using time series of actual accidents and related data. Pertanika Journal of Science and Technology, 32 (3). 1103-1122. ISSN 0128-7680; ESSN: 2231-8526 |
journal_url | http://www.pertanika.upm.edu.my/pjst/browse/regular-issue?article=JST-4635-2023 |
doi | 10.47836/pjst.32.3.07 |
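
The record's description mentions an 80/20 train/validation split, MAPE as the error measure, and model selection by lowest AIC and BIC. The following is a minimal Python sketch of those three quantities, not the authors' implementation (the description states that the study itself used R Studio). The function names, the monthly granularity (suggested only by the seasonal period of 12) and the sample counts are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_fraction: float = 0.8
) -> Tuple[List[float], List[float]]:
    """Split a time series in time order: earliest part to train, rest to validate."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def aic(n_params: int, log_likelihood: float) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(n_params: int, n_obs: int, log_likelihood: float) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


if __name__ == "__main__":
    # Hypothetical monthly accident counts for 48 months (illustrative only,
    # not DOSH data).
    counts = [30 + (m % 12) for m in range(48)]
    train, valid = chronological_split(counts, 0.8)

    # Naive baseline: predict the training mean for every validation month.
    baseline = sum(train) / len(train)
    print(f"train={len(train)} valid={len(valid)}")
    print(f"baseline MAPE = {mape(valid, [baseline] * len(valid)):.1f}%")
```

A chronological split is used rather than a random one because shuffling a time series would let later observations leak into the training set. When comparing candidate models on the same training data, lower AIC and BIC values are preferred, with BIC penalising additional parameters more heavily as the number of observations grows.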