Systematic review of using machine learning in imputing missing values

Missing data are a universal data quality problem in many domains, leading to misleading analysis and inaccurate decisions. Much research has been done to investigate the different mechanisms of missing data and the proper techniques in handling various data types. In the last decade, machine learni...

Bibliographic Details
Main Authors: Alabadla, Mustafa, Sidi, Fatimah, Ishak, Iskandar, Ibrahim, Hamidah, Affendey, Lilly Suriani, Che Ani, Zafienas, A. Jabar, Marzanah, Bukar, Umar Ali, Devaraj, Navin Kumar, Muda, Ahmad Sobri, Tharek, Anas, Omar, Noritah, Mohd Jaya, Mohd Izham
Format: Article
Published: Institute of Electrical and Electronics Engineers 2022
collection UPM
description Missing data are a universal data quality problem in many domains, leading to misleading analysis and inaccurate decisions. Much research has been done to investigate the different mechanisms of missing data and the proper techniques in handling various data types. In the last decade, machine learning has been utilized to replace conventional methods to address the problem of missing values more efficiently. By studying and analyzing recently proposed methods using machine learning approaches, vital adoptions in accuracy, performance, and time consumed can be highlighted. This study aimed to help data analysts and researchers address the limitations of machine learning imputation methods by conducting a systematic literature review to provide a comprehensive overview of using such methods to impute missing values. Novel proposed machine learning approaches used for data imputation are analyzed and summarized to assist researchers in selecting a proper machine learning method based on several factors and settings. The review was performed on research studies published between 2016 and 2021 on adopting machine learning to impute missing values, focusing on their strengths and limitations. A total of 684 research articles from various scientific databases were analyzed using search engines, and 94 of them were selected as primary studies. Finally, several recommendations were given to guide future researchers in applying machine learning to impute missing values.
id upm.eprints-103422
institution Universiti Putra Malaysia
spelling upm.eprints-103422 2023-06-13T03:01:52Z http://psasir.upm.edu.my/id/eprint/103422/
Institute of Electrical and Electronics Engineers 2022 Article PeerReviewed Alabadla, Mustafa and Sidi, Fatimah and Ishak, Iskandar and Ibrahim, Hamidah and Affendey, Lilly Suriani and Che Ani, Zafienas and A. Jabar, Marzanah and Bukar, Umar Ali and Devaraj, Navin Kumar and Muda, Ahmad Sobri and Tharek, Anas and Omar, Noritah and Mohd Jaya, Mohd Izham (2022) Systematic review of using machine learning in imputing missing values. IEEE Access, 10. pp. 44483-44502. ISSN 2169-3536 https://ieeexplore.ieee.org/document/9762231 DOI: 10.1109/ACCESS.2022.3160841