A Survey on Data Imputation Techniques: Water Distribution System as a Use Case

The presence of missing data is problematic in most quantitative research studies. Water distribution systems (WDSs) are not immune to this problem. In fact, missing data is an inherent feature of a WDS. There are various techniques and methods to address missing data ranging from simply deleting th...

Bibliographic Details
Main Authors: Muhammad S. Osman, Adnan M. Abu-Mahfouz, Philip R. Page
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
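Subjects: Data imputation; deletion; machine-learning methods; missing data; model based procedures; multiple imputation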
Online Access: https://ieeexplore.ieee.org/document/8502041/
_version_ 1818927609978814464
author Muhammad S. Osman
Adnan M. Abu-Mahfouz
Philip R. Page
collection DOAJ
description The presence of missing data is problematic in most quantitative research studies. Water distribution systems (WDSs) are not immune to this problem. In fact, missing data is an inherent feature of a WDS. There are various techniques and methods to address missing data ranging from simply deleting the data to using complex algorithms to impute missing data. This paper reviews the different imputation options available from traditional methods (such as deletion and single imputation) to more modern and advanced methods (such as multiple imputation, model-based procedures, and machine learning techniques). The concept, application, and qualitative advantages and disadvantages of these methods are discussed. In addition, a novel approach for selecting an applicable technique is presented. The approach is a “top-down bottom-up” two-prong approach for the selection of a data analysis and missing data technique. The bottom-up approach facilitates the top-down selection of a suitable technique by analyzing the data and narrowing down the selection options. As a use case, this paper also reviews techniques that are used to impute missing data in WDSs.
first_indexed 2024-12-20T03:15:45Z
format Article
id doaj.art-aa57e25562b24324bae82834ae2d2e97
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-20T03:15:45Z
publishDate 2018-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-aa57e25562b24324bae82834ae2d2e97
2022-12-21T19:55:21Z
eng
IEEE
IEEE Access
2169-3536
2018-01-01
vol. 6, pp. 63279-63291
10.1109/ACCESS.2018.2877269
8502041
A Survey on Data Imputation Techniques: Water Distribution System as a Use Case
Muhammad S. Osman (https://orcid.org/0000-0003-2335-1885), Built Environment, Council for Scientific and Industrial Research, Pretoria, South Africa
Adnan M. Abu-Mahfouz (https://orcid.org/0000-0002-6413-3924), Modelling and Digital Science, Council for Scientific and Industrial Research, Pretoria, South Africa
Philip R. Page, Built Environment, Council for Scientific and Industrial Research, Pretoria, South Africa
title A Survey on Data Imputation Techniques: Water Distribution System as a Use Case
topic Data imputation
deletion
machine-learning methods
missing data
model based procedures
multiple imputation
url https://ieeexplore.ieee.org/document/8502041/