Imputation of precipitation data in northeast Brazil
Abstract This article evaluates four statistical methods of multiple imputation to fill in the missing data of daily precipitation in Northeast Brazil (NEB). We used a daily database collected by 94 rain gauges distributed in NEB from January 1, 1986 to December 31, 2015. The methods were: random sa...
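Main Authors: | DANIELE T. RODRIGUES, WEBER A. GONÇALVES, CLÁUDIO MOISÉS S. E SILVA, MARIA HELENA C. SPYRIDES, PAULO SÉRGIO LÚCIO |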
---|---|
Format: | Article |
Language: | English |
Published: | Academia Brasileira de Ciências 2023-06-01 |
Series: | Anais da Academia Brasileira de Ciências |
Subjects: | Bootstrap, missing data, multiple imputation, semiarid |
Online Access: | http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0001-37652023000301101&lng=en&tlng=en |
_version_ | 1797810380598149120 |
---|---|
author | DANIELE T. RODRIGUES WEBER A. GONÇALVES CLÁUDIO MOISÉS S. E SILVA MARIA HELENA C. SPYRIDES PAULO SÉRGIO LÚCIO |
author_facet | DANIELE T. RODRIGUES WEBER A. GONÇALVES CLÁUDIO MOISÉS S. E SILVA MARIA HELENA C. SPYRIDES PAULO SÉRGIO LÚCIO |
author_sort | DANIELE T. RODRIGUES |
collection | DOAJ |
description | Abstract This article evaluates four statistical methods of multiple imputation for filling in missing daily precipitation data in Northeast Brazil (NEB). We used a daily database collected by 94 rain gauges distributed across NEB from January 1, 1986 to December 31, 2015. The methods were: random sampling from the observed values; predictive mean matching; Bayesian linear regression; and the bootstrap expectation-maximization algorithm (BootEM). To compare these methods, missing data from the original series were initially excluded. The next step was to create three scenarios for each method, in which 10%, 20% and 30% of the data were removed at random. The BootEM method presented the best statistical results, with the average bias between the complete series and the imputed series ranging between -0.91 and 1.30 mm/day. The Pearson correlation values were 0.96, 0.91 and 0.86 for 10%, 20% and 30% missing data, respectively. We conclude that this is an adequate method for the reconstruction of historical precipitation data in NEB. |
first_indexed | 2024-03-13T07:07:58Z |
format | Article |
id | doaj.art-cbb0532cb0ae4a39a87dea866979efad |
institution | Directory Open Access Journal |
issn | 1678-2690 |
language | English |
last_indexed | 2024-03-13T07:07:58Z |
publishDate | 2023-06-01 |
publisher | Academia Brasileira de Ciências |
record_format | Article |
series | Anais da Academia Brasileira de Ciências |
spelling | doaj.art-cbb0532cb0ae4a39a87dea866979efad 2023-06-06T07:36:18Z eng Academia Brasileira de Ciências Anais da Academia Brasileira de Ciências 1678-2690 2023-06-01 95 2 10.1590/0001-3765202320210737 Imputation of precipitation data in northeast Brazil DANIELE T. RODRIGUES https://orcid.org/0000-0003-4307-2832 WEBER A. GONÇALVES https://orcid.org/0000-0002-5073-8527 CLÁUDIO MOISÉS S. E SILVA https://orcid.org/0000-0002-2251-7348 MARIA HELENA C. SPYRIDES https://orcid.org/0000-0001-8087-1962 PAULO SÉRGIO LÚCIO https://orcid.org/0000-0002-8170-934X Abstract This article evaluates four statistical methods of multiple imputation for filling in missing daily precipitation data in Northeast Brazil (NEB). We used a daily database collected by 94 rain gauges distributed across NEB from January 1, 1986 to December 31, 2015. The methods were: random sampling from the observed values; predictive mean matching; Bayesian linear regression; and the bootstrap expectation-maximization algorithm (BootEM). To compare these methods, missing data from the original series were initially excluded. The next step was to create three scenarios for each method, in which 10%, 20% and 30% of the data were removed at random. The BootEM method presented the best statistical results, with the average bias between the complete series and the imputed series ranging between -0.91 and 1.30 mm/day. The Pearson correlation values were 0.96, 0.91 and 0.86 for 10%, 20% and 30% missing data, respectively. We conclude that this is an adequate method for the reconstruction of historical precipitation data in NEB. http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0001-37652023000301101&lng=en&tlng=en Bootstrap missing data multiple imputation semiarid |
spellingShingle | DANIELE T. RODRIGUES WEBER A. GONÇALVES CLÁUDIO MOISÉS S. E SILVA MARIA HELENA C. SPYRIDES PAULO SÉRGIO LÚCIO Imputation of precipitation data in northeast Brazil Anais da Academia Brasileira de Ciências Bootstrap missing data multiple imputation semiarid |
title | Imputation of precipitation data in northeast Brazil |
title_full | Imputation of precipitation data in northeast Brazil |
title_fullStr | Imputation of precipitation data in northeast Brazil |
title_full_unstemmed | Imputation of precipitation data in northeast Brazil |
title_short | Imputation of precipitation data in northeast Brazil |
title_sort | imputation of precipitation data in northeast brazil |
topic | Bootstrap missing data multiple imputation semiarid |
url | http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0001-37652023000301101&lng=en&tlng=en |
work_keys_str_mv | AT danieletrodrigues imputationofprecipitationdatainnortheastbrazil AT weberagoncalves imputationofprecipitationdatainnortheastbrazil AT claudiomoisessesilva imputationofprecipitationdatainnortheastbrazil AT mariahelenacspyrides imputationofprecipitationdatainnortheastbrazil AT paulosergiolucio imputationofprecipitationdatainnortheastbrazil |
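
The evaluation design summarized in the abstract (remove 10%, 20% and 30% of a complete series at random, impute, then compare the imputed and complete series by bias and Pearson correlation) can be sketched as follows. This is a minimal illustration, not the authors' code: it covers only the simplest of the four methods (random sampling from the observed values), uses a single imputation draw rather than multiple imputations, and runs on a synthetic series. The function names, the synthetic rainfall generator, and the sign convention for bias (imputed minus complete) are all assumptions made for the example.

```python
import math
import random


def mask_at_random(series, fraction, rng):
    """Return a copy of `series` with `fraction` of its values replaced by None."""
    n_missing = round(fraction * len(series))
    missing_idx = set(rng.sample(range(len(series)), n_missing))
    return [None if i in missing_idx else x for i, x in enumerate(series)]


def impute_random_sample(masked, rng):
    """Fill each missing value with a random draw from the observed values."""
    observed = [x for x in masked if x is not None]
    return [rng.choice(observed) if x is None else x for x in masked]


def mean_bias(complete, imputed):
    """Average of (imputed - complete); the sign convention is an assumption."""
    return sum(b - a for a, b in zip(complete, imputed)) / len(complete)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def run_scenarios(complete, fractions=(0.10, 0.20, 0.30), seed=0):
    """Mask, impute and score the series once per missing-data fraction."""
    rng = random.Random(seed)
    results = {}
    for frac in fractions:
        masked = mask_at_random(complete, frac, rng)
        imputed = impute_random_sample(masked, rng)
        results[frac] = (mean_bias(complete, imputed), pearson(complete, imputed))
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for one gauge: mostly dry days, gamma-distributed wet days.
    gen = random.Random(42)
    complete = [gen.gammavariate(0.8, 10.0) if gen.random() < 0.3 else 0.0
                for _ in range(3650)]
    for frac, (bias, r) in run_scenarios(complete).items():
        print(f"{frac:.0%} missing: bias={bias:+.3f} mm/day, r={r:.3f}")
```

The other three methods named in the abstract would replace `impute_random_sample` while the masking and scoring steps stay the same, which is what makes the scenarios comparable across methods.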