Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data?

Background: Diagnostic models are frequently used to assess the role of risk factors in disease complications, and thereby to help prevent them. Missing data complicate model building. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when m...

Bibliographic Details
Main Authors: MR Baneshi, H Faramarzi, M Marzban
Format: Article
Language: English
Published: Tehran University of Medical Sciences 2012-01-01
Series: Iranian Journal of Public Health
Subjects: HIV/AIDS, Missing Data, Imputation, MICE
Online Access: https://ijph.tums.ac.ir/index.php/ijph/article/view/2635
author MR Baneshi
H Faramarzi
M Marzban
collection DOAJ
description Background: Diagnostic models are frequently used to assess the role of risk factors in disease complications, and thereby to help prevent them. Missing data complicate model building. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when data are missing. Methods: HIV patients (n=1460) referred to the Voluntary Counseling and Testing (VCT) Center of Shiraz, southern Iran, during 2004-2009 were recruited. The univariate association between each variable and death was assessed. Only variables with univariate P<0.25 were offered to the multifactorial models. First, patients with missing data on candidate variables were deleted (complete-case, C-C, model). Then missing data were imputed using Multivariate Imputation by Chained Equations (MICE). Logistic regression was fitted to the C-C and the imputed data sets (MICE model). The models were compared in terms of the number of variables retained in the final model, the width of confidence intervals, and discrimination ability. Results: About 22% of the data were lost in the C-C model. The C-C and MICE models retained 2 and 6 variables, respectively. Confidence intervals (CIs) for the C-C model were wider than those for the MICE model. The MICE model showed greater discrimination ability than the C-C model (70% versus 64%). Conclusion: The C-C analysis resulted in a loss of power and wide CIs. Once missing data were imputed, more variables reached significance and CIs were narrower. Therefore, we recommend imputation for handling missing data.
first_indexed 2024-12-20T17:50:05Z
format Article
id doaj.art-fe7bd14f25ca414c99a023cdd4517796
institution Directory Open Access Journal
issn 2251-6085
2251-6093
language English
last_indexed 2024-12-20T17:50:05Z
publishDate 2012-01-01
publisher Tehran University of Medical Sciences
record_format Article
series Iranian Journal of Public Health
spelling doaj.art-fe7bd14f25ca414c99a023cdd4517796
2022-12-21T19:30:53Z
eng
Tehran University of Medical Sciences
Iranian Journal of Public Health
2251-6085
2251-6093
2012-01-01
411
Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data?
MR Baneshi 0
H Faramarzi 1
M Marzban 2
Research Center for Modeling in Health, Kerman University of Medical Sciences, Kerman, Iran
Shiraz HIV/AIDS Research Center, Shiraz University of Medical Sciences, Shiraz Research Center for Traditional Medicine and History of Medicine, Shiraz University of Medical Scien
https://ijph.tums.ac.ir/index.php/ijph/article/view/2635
HIV/AIDS
Missing Data
Imputation
MICE
title Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data?
topic HIV/AIDS
Missing Data
Imputation
MICE
url https://ijph.tums.ac.ir/index.php/ijph/article/view/2635
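
The description field of this record contrasts complete-case (C-C) deletion with imputation by chained equations. The following is a minimal sketch of that contrast, not the authors' implementation: the two-column toy data, the function names (`fit_line`, `complete_cases`, `chained_imputation`) and the use of simple linear regression are all assumptions made for illustration. A full MICE procedure would also draw the regression parameters from their posterior, use a model suited to each variable type (for example logistic regression for binary variables), and pool the fitted models across imputed data sets.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for y ~ x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return intercept, slope, sd

def complete_cases(rows):
    """Complete-case analysis: drop every row with a missing value."""
    return [r for r in rows if None not in r]

def chained_imputation(rows, n_iter=10, seed=0):
    """Return one imputed copy of two-column data (None = missing)."""
    rng = random.Random(seed)
    data = [list(r) for r in rows]
    missing = [[v is None for v in r] for r in rows]
    # Initialise missing cells with the observed column mean.
    for j in range(2):
        mean_j = statistics.fmean([r[j] for r in rows if r[j] is not None])
        for r in data:
            if r[j] is None:
                r[j] = mean_j
    # Cycle through the columns, re-imputing each from the other.
    for _ in range(n_iter):
        for j in range(2):
            k = 1 - j
            # Fit column j on column k using rows where j was observed.
            train = [(r[k], r[j]) for r, m in zip(data, missing) if not m[j]]
            a, b, sd = fit_line([t[0] for t in train], [t[1] for t in train])
            for r, m in zip(data, missing):
                if m[j]:
                    # Prediction plus noise, so imputations reflect uncertainty.
                    r[j] = a + b * r[k] + rng.gauss(0, sd)
    return data

# Hypothetical toy data: 10 records, 3 of them with one missing value.
rows = [(1.0, 2.1), (2.0, None), (3.0, 6.2), (None, 8.1), (5.0, 9.8),
        (6.0, 12.1), (7.0, None), (8.0, 16.0), (9.0, 18.2), (10.0, 19.9)]

cc = complete_cases(rows)
lost = 1 - len(cc) / len(rows)  # share of records discarded by C-C

# Several imputed data sets, each from a different random seed.
imputed_sets = [chained_imputation(rows, seed=s) for s in range(5)]

# Pooled point estimate: average the per-data-set estimates.
pooled_mean = statistics.fmean(
    [statistics.fmean([r[1] for r in d]) for d in imputed_sets]
)
print(f"C-C keeps {len(cc)} of {len(rows)} records ({lost:.0%} lost)")
print(f"pooled mean of second column: {pooled_mean:.2f}")
```

In this sketch the complete-case analysis discards every record with any missing value, whereas each imputed data set keeps all records and leaves the observed values unchanged. In practice an established implementation, such as the R `mice` package, would normally be preferred over hand-written code.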