Developing codes for validation of PM10, PM2.5, and O3 datasets using R programming language

Introduction: The wide range of studies on air pollution requires accurate and reliable datasets. However, due to many reasons, the measured concentrations may be incomplete or biased. The development of an easy-to-use and reproducible exposure assessment method is required for researchers. Therefor...


Bibliographic Details
Main Authors: Ramin Nabizadeh Nodehi, Mostafa Hadei
Format: Article
Language: English
Published: Tehran University of Medical Sciences 2019-02-01
Series: Journal of Air Pollution and Health
Subjects: Exposure assessment; Particulate matter; Air pollution; Epidemiology; Health impact assessment
Online Access: https://japh.tums.ac.ir/index.php/japh/article/view/198
_version_ 1819025557104361472
author Ramin Nabizadeh Nodehi
Mostafa Hadei
collection DOAJ
description Introduction: The wide range of air pollution studies requires accurate and reliable datasets. However, for many reasons, measured concentrations may be incomplete or biased. Researchers therefore need an easy-to-use and reproducible exposure assessment method. In this article, we describe and present a series of codes written in the R programming language for handling, validating, and averaging PM10, PM2.5, and O3 datasets. Findings: These codes can be used in any type of air pollution study that needs PM and ozone concentrations representative of real concentrations. We combined criteria from several guidelines proposed by the US EPA and the APHEKOM project to obtain an acceptable methodology. Separate .csv files for PM10, PM2.5, and O3 should be prepared as input. After a file is imported into R, negative and zero concentrations are first removed from the entire dataset. Next, only monitors that have at least 75% of hourly concentrations are selected. Then, 24-h averages are calculated for PM, and daily maximum 8-h moving averages are calculated for ozone. As output, the codes create two datasets. The first contains the hourly concentrations of the pollutant of interest (PM10, PM2.5, or O3) at the valid stations, together with their city-level average. The second contains the final city-level 24-h averages for PM10 and PM2.5, or the final city-level daily maximum 8-h averages for O3. Conclusion: These validated codes use a reliable and valid methodology and eliminate the possibility of incorrect data handling and averaging. The codes are free to use without limitation, provided that this article is cited.
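The validation steps summarized in the description can be sketched roughly as follows. This is an illustrative Python sketch using only the standard library, not the authors' R codes. All function names are hypothetical, the input is assumed to be a mapping from station name to a list of hourly values that starts at hour 0 and uses None for missing readings, and the 8-h window rule (a trailing window needing at least 6 valid hours, assigned to the day of its last hour) is an assumption that the abstract does not specify.

```python
from statistics import mean

def clean(values):
    """Replace negative, zero, and missing readings with None."""
    return [v if v is not None and v > 0 else None for v in values]

def valid_stations(hourly, min_fraction=0.75):
    """Keep stations with at least min_fraction valid hourly values."""
    kept = {}
    for name, values in hourly.items():
        cleaned = clean(values)
        n_valid = sum(v is not None for v in cleaned)
        if cleaned and n_valid / len(cleaned) >= min_fraction:
            kept[name] = cleaned
    return kept

def city_hourly(stations):
    """Hour-by-hour mean across stations (None if no station reported)."""
    series = []
    for hour_values in zip(*stations.values()):
        present = [v for v in hour_values if v is not None]
        series.append(mean(present) if present else None)
    return series

def daily_mean(series):
    """24-h averages over consecutive blocks of 24 hourly values (for PM)."""
    out = []
    for start in range(0, len(series), 24):
        present = [v for v in series[start:start + 24] if v is not None]
        out.append(mean(present) if present else None)
    return out

def daily_max_8h(series, min_hours=6):
    """Daily maximum of trailing 8-h moving averages (for ozone).

    Assumption: a window counts only if it has min_hours valid values.
    """
    out = []
    for start in range(0, len(series), 24):
        best = None
        for end in range(start, min(start + 24, len(series))):
            window = series[max(0, end - 7):end + 1]
            present = [v for v in window if v is not None]
            if len(present) >= min_hours:
                avg = mean(present)
                best = avg if best is None or avg > best else best
        out.append(best)
    return out

# Made-up two-day example with three hypothetical stations.
hourly = {
    "A": [10.0] * 24 + [20.0] * 24,
    "B": [30.0] * 24 + [-1.0] * 6 + [40.0] * 18,  # 6 invalid hours, still >= 75%
    "C": [0.0] * 30 + [5.0] * 18,                 # mostly zeros, dropped
}
stations = valid_stations(hourly)
city = city_hourly(stations)
print(sorted(stations))
print(daily_mean(city))
print(daily_max_8h(city))
```

In this sketch the completeness check is applied once over the whole period for each station; a real workflow might also apply completeness rules per day or per window, which the abstract leaves open.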
first_indexed 2024-12-21T05:12:34Z
format Article
id doaj.art-35412ee89f4f4d969af7ab202c54a1de
institution Directory Open Access Journal
issn 2476-3071
language English
last_indexed 2024-12-21T05:12:34Z
publishDate 2019-02-01
publisher Tehran University of Medical Sciences
record_format Article
series Journal of Air Pollution and Health
spelling doaj.art-35412ee89f4f4d969af7ab202c54a1de; 2022-12-21T19:15:00Z; eng; Tehran University of Medical Sciences; Journal of Air Pollution and Health; ISSN 2476-3071; 2019-02-01; vol. 4, no. 1; DOI 10.18502/japh.v4i1.604; Ramin Nabizadeh Nodehi and Mostafa Hadei (both: Department of Environmental Health Engineering, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran); https://japh.tums.ac.ir/index.php/japh/article/view/198; Exposure assessment; Particulate matter; Air pollution; Epidemiology; Health impact assessment
title Developing codes for validation of PM10, PM2.5, and O3 datasets using R programming language
topic Exposure assessment; Particulate matter; Air pollution; Epidemiology; Health impact assessment
url https://japh.tums.ac.ir/index.php/japh/article/view/198