The effectiveness of a probabilistic principal component analysis model and expectation maximisation algorithm in treating missing daily rainfall data
The reliability and accuracy of a risk assessment of extreme hydro-meteorological events are highly dependent on the quality of the historical rainfall time series data. However, missing data in a time series such as this could result in lower quality data. Therefore, this paper proposes a multiple-...
Main Authors: | Chuan, Zun Liang; Sayang, Mohd Deni; Fam, Soo-Fen; Noriszura, Ismail |
---|---|
Format: | Article |
Language: | English |
Published: | Korean Meteorological Society and Springer Nature, 2020 |
Subjects: | GE Environmental Sciences |
Online Access: | http://umpir.ump.edu.my/id/eprint/30291/1/The%20Effectiveness%20of%20a%20Probabilistic%20Principal%20Component%20Analysis%20Model%20and%20Expectation%20Maximisation%20Algorithm%20in%20Treating%20Missing%20Daily%20Rainfall%20Data.pdf http://umpir.ump.edu.my/id/eprint/30291/7/The%20Effectiveness%20of%20a%20Probabilistic%20Principal%20Component%20Analysis.pdf |
_version_ | 1825813657402998784 |
---|---|
author | Chuan, Zun Liang Sayang, Mohd Deni Fam, Soo-Fen Noriszura, Ismail |
collection | UMP |
description | The reliability and accuracy of a risk assessment of extreme hydro-meteorological events are highly dependent on the quality of the historical rainfall time series data. However, missing values in such a time series could lower the quality of the data. Therefore, this paper proposes a multiple-imputation algorithm for treating missing data without requiring information from adjoining monitoring stations. The proposed imputation algorithm is based on the M-component probabilistic principal component analysis model and an expectation maximisation algorithm (MPPCA-EM). To evaluate the effectiveness of the MPPCA-EM imputation algorithm, six distinct historical daily rainfall time series were recorded at six monitoring stations. These stations were located in the coastal and inland regions of the East-Coast Economic Region (ECER), Malaysia. The results show that, for missing historical daily rainfall data recorded at coastal monitoring stations, the 2-component probabilistic principal component analysis model and expectation-maximisation algorithm (2PPCA-EM) was superior to the single- and multiple-imputation algorithms proposed in previous studies. In contrast, the single-imputation algorithms proposed in previous studies were superior to the MPPCA-EM imputation algorithms for missing historical daily rainfall data recorded at inland monitoring stations. |
first_indexed | 2024-03-06T12:47:16Z |
format | Article |
id | UMPir30291 |
institution | Universiti Malaysia Pahang |
language | English |
last_indexed | 2024-03-06T12:47:16Z |
publishDate | 2020 |
publisher | Korean Meteorological Society and Springer Nature |
record_format | dspace |
citation | Chuan, Zun Liang and Sayang, Mohd Deni and Fam, Soo-Fen and Noriszura, Ismail (2020) The effectiveness of a probabilistic principal component analysis model and expectation maximisation algorithm in treating missing daily rainfall data. Asia-Pacific Journal of Atmospheric Sciences, 56 (1). pp. 119-129. ISSN 1976-7633. (Published; peer reviewed) https://doi.org/10.1007/s13143-019-00135-8 Repository record: http://umpir.ump.edu.my/id/eprint/30291/ |
title | The effectiveness of a probabilistic principal component analysis model and expectation maximisation algorithm in treating missing daily rainfall data |
topic | GE Environmental Sciences |
url | http://umpir.ump.edu.my/id/eprint/30291/1/The%20Effectiveness%20of%20a%20Probabilistic%20Principal%20Component%20Analysis%20Model%20and%20Expectation%20Maximisation%20Algorithm%20in%20Treating%20Missing%20Daily%20Rainfall%20Data.pdf http://umpir.ump.edu.my/id/eprint/30291/7/The%20Effectiveness%20of%20a%20Probabilistic%20Principal%20Component%20Analysis.pdf |