Methods for handling missing data in serially sampled sputum specimens for mycobacterial culture conversion calculation
Abstract Background: The occurrence and timing of mycobacterial culture conversion are used as a proxy for tuberculosis treatment response. When researchers serially sample sputum during tuberculosis studies, contamination or missed visits lead to missing data points. Traditionally, this is managed b...
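Main Authors: | Samantha Malatesta, Isabelle R. Weir, Sarah E. Weber, Tara C. Bouton, Tara Carney, Danie Theron, Bronwyn Myers, C. Robert Horsburgh, Robin M. Warren, Karen R. Jacobson, Laura F. White |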
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-11-01 |
Series: | BMC Medical Research Methodology |
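Subjects: | Longitudinal missing data; Multiple imputation; Survival analysis; Tuberculosis; Culture conversion |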
Online Access: | https://doi.org/10.1186/s12874-022-01782-8 |
_version_ | 1811319611614822400 |
---|---|
author | Samantha Malatesta, Isabelle R. Weir, Sarah E. Weber, Tara C. Bouton, Tara Carney, Danie Theron, Bronwyn Myers, C. Robert Horsburgh, Robin M. Warren, Karen R. Jacobson, Laura F. White |
author_sort | Samantha Malatesta |
collection | DOAJ |
description | Abstract Background: The occurrence and timing of mycobacterial culture conversion are used as a proxy for tuberculosis treatment response. When researchers serially sample sputum during tuberculosis studies, contamination or missed visits lead to missing data points. Traditionally, this is managed by ignoring missing data or by simple carry-forward techniques. Statistically advanced multiple imputation methods potentially decrease bias and retain sample size and statistical power. Methods: We analyzed data from 261 participants who provided weekly sputa for the first 12 weeks of tuberculosis treatment. We compared methods for handling missing data points in a longitudinal study with a time-to-event outcome. Our primary outcome was time to culture conversion, defined as two consecutive weeks with no Mycobacterium tuberculosis growth. Methods used to address missing data included: 1) available case analysis, 2) last observation carried forward, and 3) multiple imputation by fully conditional specification. For each method, we calculated the proportion culture converted and used survival analysis to estimate Kaplan-Meier curves, hazard ratios, and restricted mean survival times. We compared methods based on point estimates, confidence intervals, and conclusions to specific research questions. Results: The three missing data methods led to differences in the number of participants achieving conversion: 78 (32.8%) participants converted with available case analysis, 154 (64.7%) converted with last observation carried forward, and 184 (77.1%) converted with multiple imputation. Multiple imputation yielded smaller point estimates and narrower confidence intervals than the simple approaches. The adjusted hazard ratio for smear negative participants was 3.4 (95% CI 2.3, 5.1) using multiple imputation compared to 5.2 (95% CI 3.1, 8.7) using last observation carried forward and 5.0 (95% CI 2.4, 10.6) using available case analysis. Conclusion: We showed that accounting for missing sputum data through multiple imputation, a statistically valid approach under certain conditions, can lead to different conclusions than naïve methods. How to handle missing data must be considered carefully and pre-specified prior to analysis. We used data from a TB study to demonstrate these concepts; however, the methods we described are broadly applicable to longitudinal missing data. We provide valuable statistical guidance and code for researchers to appropriately handle missing data in longitudinal studies. |
first_indexed | 2024-04-13T12:46:00Z |
format | Article |
id | doaj.art-c90dabb11af54f9b953cd31967877118 |
institution | Directory Open Access Journal |
issn | 1471-2288 |
language | English |
last_indexed | 2024-04-13T12:46:00Z |
publishDate | 2022-11-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Research Methodology |
spelling | doaj.art-c90dabb11af54f9b953cd31967877118; 2022-12-22T02:46:23Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2022-11-01; vol. 22, no. 1, pp. 1-11; 10.1186/s12874-022-01782-8; Methods for handling missing data in serially sampled sputum specimens for mycobacterial culture conversion calculation; Samantha Malatesta (Department of Biostatistics, Boston University School of Public Health); Isabelle R. Weir (Center for Biostatistics in AIDS Research in the Department of Biostatistics, Harvard T.H. Chan School of Public Health); Sarah E. Weber (Section of Infectious Diseases, Boston Medical Center); Tara C. Bouton (Section of Infectious Diseases, Department of Medicine, Boston University School of Medicine); Tara Carney (Alcohol, Tobacco and Other Drug Research Unit, South African Medical Research Council); Danie Theron (Brewelskloof Hospital); Bronwyn Myers (Alcohol, Tobacco and Other Drug Research Unit, South African Medical Research Council); C. Robert Horsburgh (Department of Biostatistics, Boston University School of Public Health); Robin M. Warren (DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research and South African Medical Research Council Centre for Tuberculosis Research); Karen R. Jacobson (Section of Infectious Diseases, Department of Medicine, Boston University School of Medicine); Laura F. White (Department of Biostatistics, Boston University School of Public Health); https://doi.org/10.1186/s12874-022-01782-8; Longitudinal missing data; Multiple imputation; Survival analysis; Tuberculosis; Culture conversion |
title | Methods for handling missing data in serially sampled sputum specimens for mycobacterial culture conversion calculation |
topic | Longitudinal missing data; Multiple imputation; Survival analysis; Tuberculosis; Culture conversion |
url | https://doi.org/10.1186/s12874-022-01782-8 |
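
The abstract defines culture conversion as two consecutive weeks with no Mycobacterium tuberculosis growth and contrasts available case analysis with last observation carried forward (LOCF). The following is a minimal Python sketch of how those two simple approaches can give different conversion times for the same participant. It is not the authors' code. The names `locf` and `conversion_week` and the example series are hypothetical, and two choices are assumptions made here: that a missing week interrupts a run of negative cultures under available case analysis, and that the conversion time is the first of the two negative weeks. Multiple imputation by fully conditional specification is not sketched.

```python
from typing import List, Optional

# One weekly culture result: True = growth, False = no growth, None = missing.
Result = Optional[bool]


def locf(results: List[Result]) -> List[Result]:
    """Replace each missing week with the most recent observed result.

    Weeks before the first observed result stay missing.
    """
    filled: List[Result] = []
    last: Result = None
    for r in results:
        if r is not None:
            last = r
        filled.append(last)
    return filled


def conversion_week(results: List[Result]) -> Optional[int]:
    """Return the 1-based week that starts the first pair of consecutive
    no-growth weeks, or None if no such pair is observed.

    Assumption: a missing week (None) never counts as negative, so it
    breaks a run of negatives.
    """
    for i in range(len(results) - 1):
        if results[i] is False and results[i + 1] is False:
            return i + 1
    return None


# Hypothetical participant: weeks 3 and 5 are missing.
weekly = [True, True, None, False, None, False, False, False]

print(conversion_week(weekly))        # available case: 6
print(conversion_week(locf(weekly)))  # LOCF: 4
```

In this made-up series, carrying the week 4 negative result forward into the missing week 5 moves the conversion time from week 6 to week 4, which illustrates why the choice of method can change the number of participants counted as converted.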