The effectiveness of a probabilistic principal component analysis model and expectation maximisation algorithm in treating missing daily rainfall data

The reliability and accuracy of a risk assessment of extreme hydro-meteorological events depend heavily on the quality of the historical rainfall time series data. Missing values in such a time series, however, degrade data quality. Therefore, this paper proposes a multiple-...

Bibliographic Details
Main Authors: Chuan, Zun Liang; Sayang, Mohd Deni; Fam, Soo-Fen; Noriszura, Ismail
Format: Article
Language: English
Published: Korean Meteorological Society and Springer Nature 2020
Subjects: GE Environmental Sciences
Online Access: http://umpir.ump.edu.my/id/eprint/30291/1/The%20Effectiveness%20of%20a%20Probabilistic%20Principal%20Component%20Analysis%20Model%20and%20Expectation%20Maximisation%20Algorithm%20in%20Treating%20Missing%20Daily%20Rainfall%20Data.pdf
http://umpir.ump.edu.my/id/eprint/30291/7/The%20Effectiveness%20of%20a%20Probabilistic%20Principal%20Component%20Analysis.pdf
description The reliability and accuracy of a risk assessment of extreme hydro-meteorological events depend heavily on the quality of the historical rainfall time series data. Missing values in such a time series, however, degrade data quality. Therefore, this paper proposes a multiple-imputation algorithm for treating missing data that does not require information from adjoining monitoring stations. The proposed imputation algorithms are based on the M-component probabilistic principal component analysis model and an expectation-maximisation algorithm (MPPCA-EM). To evaluate the effectiveness of the MPPCA-EM imputation algorithm, six distinct historical daily rainfall time series recorded at six monitoring stations were used. These stations are located in the coastal and inland regions of the East-Coast Economic Region (ECER), Malaysia. The results show that, for missing historical daily rainfall data recorded at coastal monitoring stations, the 2-component probabilistic principal component analysis model with the expectation-maximisation algorithm (2PPCA-EM) was superior to the single- and multiple-imputation algorithms proposed in previous studies. In contrast, the single-imputation algorithms proposed in previous studies were superior to the MPPCA-EM imputation algorithms for missing historical daily rainfall data recorded at inland monitoring stations.
id UMPir30291
institution Universiti Malaysia Pahang
citation Chuan, Zun Liang and Sayang, Mohd Deni and Fam, Soo-Fen and Noriszura, Ismail (2020) The effectiveness of a probabilistic principal component analysis model and expectation maximisation algorithm in treating missing daily rainfall data. Asia-Pacific Journal of Atmospheric Sciences, 56 (1). pp. 119-129. ISSN 1976-7633. (Published) https://doi.org/10.1007/s13143-019-00135-8
topic GE Environmental Sciences