Analyzing the Data Completeness of Patients’ Records Using a Random Variable Approach to Predict the Incompleteness of Electronic Health Records
The purpose of this article is to illustrate an investigation of methods that can be effectively used to predict the data incompleteness of a dataset. Here, the investigators have conceptualized data incompleteness as a random variable, with the overall goal behind experimentation providing a 360-de...
Main Authors: | Varadraj P. Gurupur, Paniz Abedin, Sahar Hooshmand, Muhammed Shelleh |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-10-01 |
Series: | Applied Sciences |
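Subjects: | health informatics, big data models, data completeness, probability density, Kolmogorov–Smirnov test |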
Online Access: | https://www.mdpi.com/2076-3417/12/21/10746 |
_version_ | 1797469288643166208 |
---|---|
author | Varadraj P. Gurupur; Paniz Abedin; Sahar Hooshmand; Muhammed Shelleh |
author_sort | Varadraj P. Gurupur |
collection | DOAJ |
description | This article investigates methods that can be used to predict the data incompleteness of a dataset. The investigators conceptualize data incompleteness as a random variable, with the overall goal of providing a 360-degree view of the concept: the incompleteness of a dataset is treated as both a continuous and a discrete random variable, depending on the aspect of the analysis required. In the course of the experiments, the investigators identified the Kolmogorov–Smirnov goodness-of-fit test, the Mielke distribution, and the beta distribution as key tools for analyzing the incompleteness of the datasets used for experimentation. These methods were also compared with a mixture density network. Overall, the investigators provide key insights into methods and algorithms for predicting data incompleteness and a pathway for further exploration of the problem. |
first_indexed | 2024-03-09T19:19:16Z |
format | Article |
id | doaj.art-9ad8ec5f00434514b4fc63d53bb2b6f0 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T19:19:16Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-9ad8ec5f00434514b4fc63d53bb2b6f0; 2023-11-24T03:32:09Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-10-01; vol. 12, no. 21, art. 10746; DOI 10.3390/app122110746; Analyzing the Data Completeness of Patients’ Records Using a Random Variable Approach to Predict the Incompleteness of Electronic Health Records; Varadraj P. Gurupur (School of Global Health Management and Informatics, University of Central Florida, Orlando, FL 32816, USA); Paniz Abedin (Department of Computer Science, Florida Polytechnic University, Lakeland, FL 33805, USA); Sahar Hooshmand (Department of Computer Science, California State University-Dominguez Hills, Carson, CA 90747, USA); Muhammed Shelleh (Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA); https://www.mdpi.com/2076-3417/12/21/10746; health informatics; big data models; data completeness; probability density; Kolmogorov–Smirnov test |
title | Analyzing the Data Completeness of Patients’ Records Using a Random Variable Approach to Predict the Incompleteness of Electronic Health Records |
topic | health informatics; big data models; data completeness; probability density; Kolmogorov–Smirnov test |
url | https://www.mdpi.com/2076-3417/12/21/10746 |
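
The description above treats the incompleteness of a record as a random variable and names the beta distribution and the Kolmogorov–Smirnov goodness-of-fit test among the tools used. The following is a minimal sketch of that idea, not the authors' procedure: the field names and sample records are hypothetical, the beta parameters are estimated by the method of moments (the record does not say how the article fits them), and the Kolmogorov–Smirnov statistic is computed against a uniform reference CDF purely as a stand-in for a fitted distribution.

```python
from statistics import mean, pvariance

# Hypothetical list of fields every patient record is expected to carry.
FIELDS = ["name", "dob", "blood_pressure", "allergies"]


def incompleteness(record, fields=FIELDS):
    """Fraction of expected fields that are absent, None, or an empty string."""
    missing = sum(1 for f in fields if record.get(f) in (None, ""))
    return missing / len(fields)


def beta_method_of_moments(xs):
    """Estimate Beta(alpha, beta) parameters from the sample mean and variance.

    This is an assumed fitting method chosen for simplicity.
    """
    m = mean(xs)
    v = pvariance(xs)
    if not 0 < m < 1 or not 0 < v < m * (1 - m):
        raise ValueError("sample moments are not compatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


def ks_statistic(xs, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDF of xs and the reference CDF."""
    ordered = sorted(xs)
    n = len(ordered)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(ordered)
    )


# Hypothetical sample records, invented for illustration only.
records = [
    {"name": "A", "dob": "1980-01-01", "blood_pressure": "120/80", "allergies": "none"},
    {"name": "B", "dob": "1975-05-12", "blood_pressure": None, "allergies": "latex"},
    {"name": "C", "dob": "", "blood_pressure": "110/70", "allergies": None},
    {"name": "D", "dob": "1990-09-30", "allergies": "none"},
]

fractions = [incompleteness(r) for r in records]
alpha, beta_param = beta_method_of_moments(fractions)
# Uniform CDF on [0, 1], i.e. Beta(1, 1), used here only as a placeholder reference.
d_stat = ks_statistic(fractions, lambda x: x)
```

For real data, a statistics library would be the more natural choice; SciPy, for example, provides `scipy.stats.beta`, `scipy.stats.mielke`, and `scipy.stats.kstest`, which cover fitting and testing against the distributions named in the description.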