A spatial feature engineering algorithm for creating air pollution health datasets
Air pollution is one of the significant causes of mortality and morbidity every year. In recent years, many researchers have focused their attention on the association between air pollution and health. Air pollution data and health data are used in these studies, and feature engineering is used to create...
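Main Authors: | Raja Sher Afgun Usmani, Thulasyammal Ramiah Pillai, Ibrahim Abaker Targio Hashem, Noor Zaman Jhanjhi, Anum Saeed, Akibu Mahmoud Abdullahi |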
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co., Ltd., 2020-06-01 |
Series: | International Journal of Cognitive Computing in Engineering |
Subjects: | Air pollution; Feature engineering; Health; Air quality; Hospitalization; Mortality |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666307420300115 |
_version_ | 1797976444845948928 |
---|---|
author | Raja Sher Afgun Usmani; Thulasyammal Ramiah Pillai; Ibrahim Abaker Targio Hashem; Noor Zaman Jhanjhi; Anum Saeed; Akibu Mahmoud Abdullahi |
author_sort | Raja Sher Afgun Usmani |
collection | DOAJ |
description | Air pollution is one of the significant causes of mortality and morbidity every year. In recent years, many researchers have focused their attention on the association between air pollution and health. Air pollution data and health data are used in these studies, and feature engineering is used to create and optimize the air quality and health features. To associate these datasets, the residential address, the community/county/block/city, and the hospital/school address are used as association parameters. A spatial problem arises when the Air Quality Monitoring (AQM) stations are concentrated in urban areas within a region and the residential address or any other spatial parameter is used. Where AQM stations operate in close proximity in urban areas, their coverage overlaps, which raises the question of how to associate patients with the relevant AQM station. Most studies also do not take the distance between patients and AQM stations into account. In this study, we propose a spatial feature engineering algorithm with functions to find the coordinates of patients, calculate their distances to the AQM stations, and associate patient records with the nearest AQM station, thereby removing these limitations of current air pollution health datasets. The proposed algorithm is applied to a case study in Klang Valley, Malaysia. The results show that the proposed algorithm can generate air pollution health datasets efficiently, and it also provides a radius facility to exclude patients who are situated far away from the stations. |
first_indexed | 2024-04-11T04:50:57Z |
format | Article |
id | doaj.art-58345826e6b2406491e1861e2d77475a |
institution | Directory of Open Access Journals |
issn | 2666-3074 |
language | English |
last_indexed | 2024-04-11T04:50:57Z |
publishDate | 2020-06-01 |
publisher | KeAi Communications Co., Ltd. |
record_format | Article |
series | International Journal of Cognitive Computing in Engineering |
spelling | doaj.art-58345826e6b2406491e1861e2d77475a; 2022-12-27T04:37:11Z; eng; KeAi Communications Co., Ltd.; International Journal of Cognitive Computing in Engineering; ISSN 2666-3074; 2020-06-01; volume 1, pages 98-107; A spatial feature engineering algorithm for creating air pollution health datasets; Raja Sher Afgun Usmani (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia; corresponding author); Thulasyammal Ramiah Pillai (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia); Ibrahim Abaker Targio Hashem (College of Computing and Informatics, Department of Computer Science, University of Sharjah, Sharjah 27272, UAE); Noor Zaman Jhanjhi (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia); Anum Saeed (Center for Advance Studies in Engineering, Islamabad, Pakistan); Akibu Mahmoud Abdullahi (School of Computer Science and Engineering, Taylor’s University, Subang Jaya, Selangor, Malaysia); http://www.sciencedirect.com/science/article/pii/S2666307420300115; Air pollution; Feature engineering; Health; Air quality; Hospitalization; Mortality |
title | A spatial feature engineering algorithm for creating air pollution health datasets |
topic | Air pollution; Feature engineering; Health; Air quality; Hospitalization; Mortality |
url | http://www.sciencedirect.com/science/article/pii/S2666307420300115 |
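
The description above outlines three steps: obtain patient coordinates, compute distances to the AQM stations, and link each patient to the nearest station, optionally excluding patients beyond a radius. The following is a minimal sketch of the last two steps, not the authors' implementation. It assumes the coordinates have already been obtained (geocoding is omitted), uses the haversine great-circle formula as the distance measure (the record does not say which formula the paper uses), and all function names, station identifiers, and coordinates are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(
    patient: Coord,
    stations: Dict[str, Coord],
    radius_km: Optional[float] = None,
) -> Optional[Tuple[str, float]]:
    """Return (station_id, distance_km) for the closest station.

    Returns None when there are no stations, or when radius_km is given
    and even the closest station lies farther away than that radius.
    """
    if not stations:
        return None
    # Compute the distance to every station and keep the smallest one.
    best_id, best_dist = min(
        ((sid, haversine_km(*patient, *coord)) for sid, coord in stations.items()),
        key=lambda pair: pair[1],
    )
    if radius_km is not None and best_dist > radius_km:
        return None  # patient is excluded by the radius filter
    return best_id, best_dist


if __name__ == "__main__":
    # Hypothetical station and patient coordinates, for illustration only.
    stations = {"station_a": (3.10, 101.60), "station_b": (3.15, 101.70)}
    patient = (3.11, 101.62)
    print(nearest_station(patient, stations))
    print(nearest_station(patient, stations, radius_km=1.0))
```

Assigning each patient to exactly one nearest station sidesteps the overlapping-coverage problem mentioned in the description, and the optional `radius_km` argument plays the role of the radius facility for excluding distant patients.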