Classification Prediction of PM<sub>10</sub> Concentration Using a Tree-Based Machine Learning Approach

Bibliographic Details
Main Authors: Wan Nur Shaziayani, Ahmad Zia Ul-Saufie, Sofianita Mutalib, Norazian Mohamad Noor, Nazatul Syadia Zainordin
Format: Article
Language: English
Published: MDPI AG 2022-03-01
Series: Atmosphere
Subjects: PM<sub>10</sub>, prediction, decision tree, boosted regression tree, random forest
Online Access: https://www.mdpi.com/2073-4433/13/4/538
collection DOAJ
description PM<sub>10</sub> prediction has received considerable attention because of the harmful effects of PM<sub>10</sub> on human health. Machine learning approaches have the potential to predict and classify future PM<sub>10</sub> concentrations accurately. In this study, three machine learning algorithms, namely decision tree (DT), boosted regression tree (BRT), and random forest (RF), were applied to predict PM<sub>10</sub> in Kota Bharu, Kelantan. The three methods were compared to find the best one for predicting the next day's PM<sub>10</sub> concentration, using maximum daily data from January 2002 to December 2017. Of these data, 80% were used for training and 20% for validation of the models. Performance was assessed by accuracy, sensitivity, specificity, and precision for RF, BRT, and DT. The results indicate that the three models were developed effectively and that they are applicable to the prediction of other atmospheric environmental data. The best model for classifying the next day's PM<sub>10</sub> concentration was the random forest classifier, with an accuracy of 98.37%, sensitivity of 97.19%, specificity of 99.55%, and precision of 99.54%. The result of the boosted regression tree was not substantially different from that of the RF model, with an accuracy of 98.12%, sensitivity of 97.51%, specificity of 98.72%, and precision of 98.71%. The best model can assist local governments in providing early warnings to people who are at risk of acute and chronic health consequences from air pollution.
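The four performance measures named in the description can be computed from the counts of a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function and class names are hypothetical, the toy labels are invented purely for illustration, and the choice of which PM<sub>10</sub> class counts as "positive" is an assumption, since the record does not state the class definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the confusion matrix; `positive` marks the class of interest."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def classification_metrics(c):
    """Return the four measures as percentages.

    Assumes both classes occur and at least one positive is predicted,
    so that no denominator is zero.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": 100 * (c.tp + c.tn) / total,
        "sensitivity": 100 * c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": 100 * c.tn / (c.tn + c.fp),  # true negative rate
        "precision": 100 * c.tp / (c.tp + c.fp),
    }


# Toy labels for illustration only; these are NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(counts)
```

Reporting sensitivity and specificity alongside accuracy shows how a classifier performs on each class separately, which accuracy alone can hide when one class is much rarer than the other.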
first_indexed 2024-03-09T11:10:38Z
id doaj.art-fc62263bcbb249e29075b38161825f6b
institution Directory Open Access Journal
issn 2073-4433
last_indexed 2024-03-09T11:10:38Z
spelling doaj.art-fc62263bcbb249e29075b38161825f6b
2023-12-01T00:46:17Z
eng
MDPI AG
Atmosphere
2073-4433
2022-03-01
13
4
538
10.3390/atmos13040538
Wan Nur Shaziayani: Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam 40450, Selangor, Malaysia
Ahmad Zia Ul-Saufie: Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam 40450, Selangor, Malaysia
Sofianita Mutalib: Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam 40450, Selangor, Malaysia
Norazian Mohamad Noor: Faculty of Civil Engineering Technology, Universiti Malaysia Perlis, Kompleks Pengajian Jejawi 3, Arau 02600, Perlis, Malaysia
Nazatul Syadia Zainordin: Department of Environment, Faculty of Forestry and Environment, Universiti Putra Malaysia, Seri Kembangan 43400, Selangor, Malaysia
topic PM<sub>10</sub>
prediction
decision tree
boosted regression tree
random forest