Methods for handling missing data in serially sampled sputum specimens for mycobacterial culture conversion calculation

Abstract Background The occurrence and timing of mycobacterial culture conversion are used as a proxy for tuberculosis treatment response. When researchers serially sample sputum during tuberculosis studies, contamination or missed visits lead to missing data points. Traditionally, this is managed b...

Bibliographic Details
Main Authors: Samantha Malatesta, Isabelle R. Weir, Sarah E. Weber, Tara C. Bouton, Tara Carney, Danie Theron, Bronwyn Myers, C. Robert Horsburgh, Robin M. Warren, Karen R. Jacobson, Laura F. White
Format: Article
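Language: English
Published: BMC 2022-11-01
Series: BMC Medical Research Methodology
Subjects: Longitudinal missing data; Multiple imputation; Survival analysis; Tuberculosis; Culture conversion
Online Access: https://doi.org/10.1186/s12874-022-01782-8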
collection DOAJ
description Abstract
Background: The occurrence and timing of mycobacterial culture conversion are used as a proxy for tuberculosis treatment response. When researchers serially sample sputum during tuberculosis studies, contamination or missed visits lead to missing data points. Traditionally, this is managed by ignoring missing data or by simple carry-forward techniques. Statistically advanced multiple imputation methods can potentially decrease bias while retaining sample size and statistical power.
Methods: We analyzed data from 261 participants who provided weekly sputa for the first 12 weeks of tuberculosis treatment. We compared methods for handling missing data points in a longitudinal study with a time-to-event outcome. Our primary outcome was time to culture conversion, defined as two consecutive weeks with no Mycobacterium tuberculosis growth. Methods used to address missing data included: 1) available case analysis, 2) last observation carried forward, and 3) multiple imputation by fully conditional specification. For each method, we calculated the proportion of participants who culture converted and used survival analysis to estimate Kaplan-Meier curves, hazard ratios, and restricted mean survival times. We compared methods based on point estimates, confidence intervals, and the conclusions drawn for specific research questions.
Results: The three missing data methods led to differences in the number of participants achieving conversion: 78 (32.8%) participants converted with available case analysis, 154 (64.7%) with last observation carried forward, and 184 (77.1%) with multiple imputation. Multiple imputation yielded smaller point estimates and narrower confidence intervals than the simple approaches. The adjusted hazard ratio for smear-negative participants was 3.4 (95% CI 2.3, 5.1) using multiple imputation, compared to 5.2 (95% CI 3.1, 8.7) using last observation carried forward and 5.0 (95% CI 2.4, 10.6) using available case analysis.
Conclusion: We showed that accounting for missing sputum data through multiple imputation, a statistically valid approach under certain conditions, can lead to different conclusions than naïve methods. How missing data will be handled must be carefully considered and pre-specified prior to analysis. We used data from a TB study to demonstrate these concepts; however, the methods we described are broadly applicable to longitudinal missing data. We provide statistical guidance and code for researchers to appropriately handle missing data in longitudinal studies.
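The difference between the two simple approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' published code. It assumes that weekly results are coded as True (growth), False (no growth), or None (missing), that a missing week breaks a run of negatives under available case analysis, and that conversion time is the first of the two consecutive negative weeks. The function names `locf` and `conversion_week` are hypothetical.

```python
from typing import List, Optional, Sequence

# Weekly culture result: True = M. tuberculosis growth, False = no growth,
# None = missing (contaminated specimen or missed visit). Coding is assumed.
Result = Optional[bool]


def locf(results: Sequence[Result]) -> List[Result]:
    """Last observation carried forward: fill each gap with the most
    recent observed value. Leading gaps stay missing."""
    filled: List[Result] = []
    last: Result = None
    for r in results:
        if r is not None:
            last = r
        filled.append(last)
    return filled


def conversion_week(results: Sequence[Result]) -> Optional[int]:
    """Return the 1-based week that starts the first run of two consecutive
    negative cultures, or None if no such run is observed.
    Assumption: a missing value does not count as negative."""
    for i in range(len(results) - 1):
        if results[i] is False and results[i + 1] is False:
            return i + 1
    return None


# Hypothetical participant: weeks 1-2 positive, week 3 negative,
# week 4 missing, weeks 5-6 negative.
weekly = [True, True, False, None, False, False]

print(conversion_week(weekly))        # available data only -> 5
print(conversion_week(locf(weekly)))  # after LOCF -> 3
```

In this toy series, carrying the week 3 negative forward into the missing week 4 moves conversion from week 5 to week 3, which is one way the choice of method can change both the number of participants counted as converted and their conversion times. Multiple imputation by fully conditional specification is not shown here, because it requires a fitted imputation model.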
first_indexed 2024-04-13T12:46:00Z
format Article
id doaj.art-c90dabb11af54f9b953cd31967877118
institution Directory Open Access Journal
issn 1471-2288
language English
last_indexed 2024-04-13T12:46:00Z
publishDate 2022-11-01
publisher BMC
record_format Article
series BMC Medical Research Methodology
citation BMC Medical Research Methodology, vol. 22, no. 1, pp. 1-11, 2022-11-01, https://doi.org/10.1186/s12874-022-01782-8
affiliations Samantha Malatesta: Department of Biostatistics, Boston University School of Public Health
Isabelle R. Weir: Center for Biostatistics in AIDS Research in the Department of Biostatistics, Harvard T.H. Chan School of Public Health
Sarah E. Weber: Section of Infectious Diseases, Boston Medical Center
Tara C. Bouton: Section of Infectious Diseases, Department of Medicine, Boston University School of Medicine
Tara Carney: Alcohol, Tobacco and Other Drug Research Unit, South African Medical Research Council
Danie Theron: Brewelskloof Hospital
Bronwyn Myers: Alcohol, Tobacco and Other Drug Research Unit, South African Medical Research Council
C. Robert Horsburgh: Department of Biostatistics, Boston University School of Public Health
Robin M. Warren: DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research and South African Medical Research Council Centre for Tuberculosis Research
Karen R. Jacobson: Section of Infectious Diseases, Department of Medicine, Boston University School of Medicine
Laura F. White: Department of Biostatistics, Boston University School of Public Health
topic Longitudinal missing data
Multiple imputation
Survival analysis
Tuberculosis
Culture conversion
url https://doi.org/10.1186/s12874-022-01782-8