Summary: | Introduction: Air pollution studies require accurate and reliable datasets. However, for many reasons, measured concentrations may be incomplete or biased, so researchers need an easy-to-use and reproducible exposure assessment method. In this article, we therefore describe and present a series of codes written in the R programming language for handling, validating, and averaging PM10, PM2.5, and O3 datasets.
Findings: These codes can be used in any type of air pollution study that needs PM and ozone concentrations that are representative of real concentrations. We combined criteria from several guidelines proposed by the US EPA and the APHEKOM project to obtain an acceptable methodology. Separate .csv files for PM10, PM2.5, and O3 should be prepared as input files. After a file is imported into R, negative and zero concentration values are first removed from the entire dataset. Next, only monitors that have at least 75% of hourly concentrations are selected. Then, 24-h averages are calculated for PM, and daily maxima of 8-h moving averages are calculated for ozone. As output, the codes create two datasets. The first contains the hourly concentrations of the pollutant of interest (PM10, PM2.5, or O3) at the valid stations and their city-level average. The second contains the final city-level 24-h averages for PM10 and PM2.5, or the final city-level daily maximum 8-h averages for O3.
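The processing steps described above can be illustrated with a minimal base-R sketch. This is not the published code itself. It assumes a hypothetical wide layout (an hourly `datetime` vector without gaps and a numeric matrix `conc` with one column per station), applies the 75% criterion over the whole study period, and uses a trailing 8-h window assigned to its ending hour. All function names, variable names, and the toy data are illustrative assumptions.

```r
# Assumed layout (hypothetical): `conc` is a numeric matrix with one row per
# hour and one column per station; `datetime` is the matching POSIXct vector.

validate_stations <- function(conc, min_coverage = 0.75) {
  conc[!is.na(conc) & conc <= 0] <- NA            # remove zero and negative values
  coverage <- colMeans(!is.na(conc))              # share of valid hours per station
  conc[, coverage >= min_coverage, drop = FALSE]  # keep stations with >= 75% data
}

city_hourly <- function(conc) {
  avg <- rowMeans(conc, na.rm = TRUE)  # NaN when no station reported that hour
  avg[is.nan(avg)] <- NA
  avg
}

# 24-h average per calendar day (for PM10 and PM2.5).
daily_mean_24h <- function(datetime, x) {
  day <- format(datetime, "%Y-%m-%d")
  tapply(x, day, mean, na.rm = TRUE)
}

# Daily maximum of the 8-h moving average (for O3).
daily_max_8h <- function(datetime, x) {
  # Trailing 8-h mean: NA for the first 7 hours and for windows containing NA.
  ma8 <- as.numeric(stats::filter(x, rep(1 / 8, 8), sides = 1))
  day <- format(datetime, "%Y-%m-%d")
  tapply(ma8, day, function(v) if (all(is.na(v))) NA_real_ else max(v, na.rm = TRUE))
}

# Toy data (made up): two days of hourly values at three stations.
datetime <- seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"), by = "hour", length.out = 48)
conc <- cbind(A = 1:48,
              B = c(rep(NA, 40), rep(10, 8)),  # too sparse, will be dropped
              C = c(0, -5, 3:48))              # zero and negative become NA

valid    <- validate_stations(conc)
city     <- city_hourly(valid)
pm_daily <- daily_mean_24h(datetime, city)
o3_daily <- daily_max_8h(datetime, city)
```

In this sketch, `valid` together with `city` corresponds to the first output (hourly values at valid stations plus the city average), while `pm_daily` or `o3_daily` corresponds to the second, depending on the pollutant.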
Conclusion: These validated codes use a reliable and valid methodology and eliminate the possibility of incorrect data handling and averaging. The codes are free to use without any limitation, provided that this article is cited.
|