A comparison of various imputation methods for missing values in air quality data

This paper presents various imputation methods for air quality data specifically in Malaysia. The main objective was to select the best method of imputation and to compare whether there was any difference in the methods used between stations in Peninsular Malaysia. Missing data for various cases are...


Bibliographic details
Main authors: Nuryazmin Ahmat Zainuri, Abdul Aziz Jemain, Nora Muda
Format: Article
Language: English
Published: Universiti Kebangsaan Malaysia 2015
Online access: http://journalarticle.ukm.my/8488/1/17_NuryAzmin.pdf
collection UKM
description This paper compares imputation methods for air quality data in Malaysia. The main objective was to select the best imputation method and to check whether the preferred method differed between stations in Peninsular Malaysia. Missing data were simulated at random for six cases: 5, 10, 15, 20, 25 and 30% missing. The six methods used were mean substitution, median substitution, the expectation-maximization (EM) method, singular value decomposition (SVD), the K-nearest neighbour (KNN) method and the sequential K-nearest neighbour (SKNN) method. The performance of the imputations was compared using three performance indicators: the correlation coefficient (R), the index of agreement (d) and the mean absolute error (MAE). Based on the results obtained, EM, KNN and SKNN are the three best methods. The same result was obtained for all eight monitoring stations used in this study.
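The evaluation loop described in the abstract can be illustrated with a minimal Python sketch. It covers only the two simplest methods (mean and median substitution) and is not the authors' implementation. Several things here are assumptions that the record does not state: the synthetic gamma-distributed series standing in for real station data, the helper names, the choice to score the whole completed series rather than only the imputed positions, and the use of Willmott's formula for the index of agreement, d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2).

```python
import numpy as np


def mask_at_random(values, fraction, rng):
    """Return a copy of `values` with `fraction` of entries set to NaN, plus the mask."""
    n_missing = int(round(fraction * values.size))
    idx = rng.choice(values.size, size=n_missing, replace=False)
    mask = np.zeros(values.size, dtype=bool)
    mask[idx] = True
    damaged = values.astype(float).copy()
    damaged[mask] = np.nan
    return damaged, mask


def impute_constant(damaged, statistic):
    """Fill NaNs with one statistic of the observed values (np.nanmean or np.nanmedian)."""
    fill = statistic(damaged)
    return np.where(np.isnan(damaged), fill, damaged)


def mae(observed, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(predicted - observed)))


def correlation(observed, predicted):
    """Pearson correlation coefficient R."""
    return float(np.corrcoef(observed, predicted)[0, 1])


def index_of_agreement(observed, predicted):
    """Index of agreement d (Willmott's form, assumed here)."""
    obar = observed.mean()
    num = np.sum((predicted - observed) ** 2)
    den = np.sum((np.abs(predicted - obar) + np.abs(observed - obar)) ** 2)
    return float(1.0 - num / den)


def compare_methods(series, fractions, rng):
    """Score mean and median substitution at each missing-data fraction."""
    rows = []
    for fraction in fractions:
        damaged, _ = mask_at_random(series, fraction, rng)
        for name, stat in (("mean", np.nanmean), ("median", np.nanmedian)):
            filled = impute_constant(damaged, stat)
            rows.append((fraction, name,
                         correlation(series, filled),
                         index_of_agreement(series, filled),
                         mae(series, filled)))
    return rows


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, right-skewed stand-in for a pollutant series (an assumption).
    series = rng.gamma(shape=2.0, scale=25.0, size=1000)
    for row in compare_methods(series, (0.05, 0.10, 0.15, 0.20, 0.25, 0.30), rng):
        print("%.2f %-6s R=%.3f d=%.3f MAE=%.3f" % row)
```

A higher R and d and a lower MAE indicate a better imputation. The whole series is scored because a constant fill has zero variance on the imputed positions alone, which leaves R undefined there.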
first_indexed 2024-03-06T04:08:13Z
format Article
id ukm.eprints-8488
institution Universiti Kebangsaan Malaysia
language English
last_indexed 2024-03-06T04:08:13Z
publishDate 2015
publisher Universiti Kebangsaan Malaysia
record_format dspace
spelling ukm.eprints-8488 2016-12-14T06:47:19Z http://journalarticle.ukm.my/8488/ Universiti Kebangsaan Malaysia 2015-03 Article PeerReviewed application/pdf en
citation Nuryazmin Ahmat Zainuri, Abdul Aziz Jemain and Nora Muda (2015) A comparison of various imputation methods for missing values in air quality data. Sains Malaysiana, 44 (3). pp. 449-456. ISSN 0126-6039 http://www.ukm.my/jsm/