A spatial feature engineering algorithm for creating air pollution health datasets

Air pollution is one of the significant causes of mortality and morbidity every year. In recent years, many researchers have focused on the associations between air pollution and health. These studies use air pollution data and health data, and feature engineering is used to create...

Bibliographic Details
Main Authors: Raja Sher Afgun Usmani, Thulasyammal Ramiah Pillai, Ibrahim Abaker Targio Hashem, Noor Zaman Jhanjhi, Anum Saeed, Akibu Mahmoud Abdullahi
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2020-06-01
Series: International Journal of Cognitive Computing in Engineering
Subjects: Air pollution; Feature engineering; Health; Air quality; Hospitalization; Mortality
Online Access: http://www.sciencedirect.com/science/article/pii/S2666307420300115
author Raja Sher Afgun Usmani
Thulasyammal Ramiah Pillai
Ibrahim Abaker Targio Hashem
Noor Zaman Jhanjhi
Anum Saeed
Akibu Mahmoud Abdullahi
collection DOAJ
description Air pollution is one of the significant causes of mortality and morbidity every year. In recent years, many researchers have focused on the associations between air pollution and health. These studies use air pollution data and health data, and apply feature engineering to create and optimize the air quality and health features. To associate these datasets, the residential address, community/county/block/city, and hospital/school address are used as association parameters. A spatial problem arises when the Air Quality Monitoring (AQM) stations are concentrated in urban areas within a region and the residential address or any other spatial parameter is used: the coverage of AQM stations operating in close proximity intersects, which raises the question of how to associate patients with the relevant AQM station. Most studies also do not take the distance from patients to the AQM stations into account. In this study, we propose a spatial feature engineering algorithm with functions to find the coordinates of patients, calculate distances to the AQM stations, and associate patient records with the nearest AQM station, thereby removing these limitations of current air pollution health datasets. The proposed algorithm is applied to a case study in Klang Valley, Malaysia. The results show that the proposed algorithm can generate air pollution health datasets efficiently, and it also provides a radius facility to exclude patients who are situated far away from the stations.
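The distance and association steps named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which distance formula is used, so the haversine great-circle distance is an assumption here, as are the function names `haversine_km` and `nearest_station`, the station and patient identifiers, and all coordinates. Geocoding of patient addresses is assumed to have been done already.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(patient, stations, radius_km=None):
    """Return (station_id, distance_km) for the closest station.

    Returns None when there are no stations, or when radius_km is given
    and the closest station lies farther away than that radius.
    """
    best_id, best_d = None, math.inf
    for station_id, (lat, lon) in stations.items():
        d = haversine_km(patient[0], patient[1], lat, lon)
        if d < best_d:
            best_id, best_d = station_id, d
    if best_id is None or (radius_km is not None and best_d > radius_km):
        return None
    return best_id, best_d


# Hypothetical station and patient coordinates, for illustration only.
stations = {"station_a": (3.10, 101.60), "station_b": (3.00, 101.40)}
patients = {"p1": (3.11, 101.61), "p2": (3.60, 101.90)}

# Associate each patient with the nearest station, excluding patients
# whose nearest station is more than 10 km away.
dataset = {pid: nearest_station(coords, stations, radius_km=10.0)
           for pid, coords in patients.items()}
```

In this sketch a patient record maps to `None` when no station falls inside the radius, which is one simple way to model the exclusion of patients situated far from every station.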
first_indexed 2024-04-11T04:50:57Z
format Article
id doaj.art-58345826e6b2406491e1861e2d77475a
institution Directory Open Access Journal
issn 2666-3074
language English
last_indexed 2024-04-11T04:50:57Z
publishDate 2020-06-01
publisher KeAi Communications Co., Ltd.
record_format Article
series International Journal of Cognitive Computing in Engineering
spelling doaj.art-58345826e6b2406491e1861e2d77475a
2022-12-27T04:37:11Z
eng
KeAi Communications Co., Ltd.
International Journal of Cognitive Computing in Engineering
2666-3074
2020-06-01
Volume 1, pages 98-107
A spatial feature engineering algorithm for creating air pollution health datasets
Raja Sher Afgun Usmani (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia; corresponding author)
Thulasyammal Ramiah Pillai (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia)
Ibrahim Abaker Targio Hashem (College of Computing and Informatics, Department of Computer Science, University of Sharjah, Sharjah 27272, UAE)
Noor Zaman Jhanjhi (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia)
Anum Saeed (Center for Advance Studies in Engineering, Islamabad, Pakistan)
Akibu Mahmoud Abdullahi (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia)
http://www.sciencedirect.com/science/article/pii/S2666307420300115
Air pollution; Feature engineering; Health; Air quality; Hospitalization; Mortality
title A spatial feature engineering algorithm for creating air pollution health datasets
topic Air pollution
Feature engineering
Health
Air quality
Hospitalization
Mortality
url http://www.sciencedirect.com/science/article/pii/S2666307420300115