What impact do assumptions about missing data have on conclusions? A practical sensitivity analysis for a cancer survival registry

Abstract Background: Within epidemiological and clinical research, missing data are a common issue and are often overlooked in publications. When missing observations are addressed, it is usually assumed that the missing data are ‘missing at random’ (MAR). This assumption should be checked fo...

Bibliographic Details
Main Authors:	M. Smuk, J. R. Carpenter, T. P. Morris
Format:	Article
Language:	English
Published:	BMC 2017-02-01
Series:	BMC Medical Research Methodology
Subjects:	Missing data; Pattern-mixture model; Sensitivity analysis; Elicitation; Missing at random; Missing not at random
Online Access:	http://link.springer.com/article/10.1186/s12874-017-0301-0

collection	DOAJ
description	Abstract. Background: Within epidemiological and clinical research, missing data are a common issue and are often overlooked in publications. When missing observations are addressed, it is usually assumed that the missing data are ‘missing at random’ (MAR). This assumption should be checked for plausibility; however, it is untestable, so inferences should be assessed for robustness to departures from MAR. Methods: We highlight the method of pattern-mixture sensitivity analysis after multiple imputation, using colorectal cancer data as an example. We focus on the Dukes’ stage variable, which has the highest proportion of missing observations. First, we find the probability of being in each Dukes’ stage given the MAR-imputed dataset. We use these probabilities in a questionnaire to elicit prior beliefs from experts on what they believe the probabilities would be in the missing data. The questionnaire responses are then used in a Dirichlet draw to create a Bayesian ‘missing not at random’ (MNAR) prior with which to impute the missing observations. The model of interest is applied and inferences are compared with those from the MAR-imputed data. Results: The inferences were largely insensitive to departure from MAR. Inferences under MNAR suggested a smaller association between Dukes’ stage and death, though the association remained positive and with similarly low p values. Conclusions: We conclude by discussing the positives and negatives of our method and highlight the importance of making people aware of the need to test the MAR assumption.
id	doaj.art-ce0e99cabc884f20ad170ad10a71dcaa
institution	Directory Open Access Journal
issn	1471-2288
spelling	doaj.art-ce0e99cabc884f20ad170ad10a71dcaa; 2022-12-21T20:04:07Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2017-02-01; volume 17, issue 1, pages 1-17; 10.1186/s12874-017-0301-0; M. Smuk (Centre for Psychiatry, Queen Mary University of London, Charterhouse Square); J. R. Carpenter (Medical Statistics Department, London School of Hygiene and Tropical Medicine); T. P. Morris (Medical Statistics Department, London School of Hygiene and Tropical Medicine)
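The imputation step described in the abstract (elicited stage probabilities feeding a Dirichlet draw, which is then used to impute missing Dukes’ stages) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name impute_mnar, the prior_strength parameter that scales the elicited probabilities into Dirichlet concentration parameters, and all numeric values are assumptions made for illustration only. The record does not say how the questionnaire responses were converted into a prior.

```python
import numpy as np

# Dukes' stage categories; the labels A-D are the standard staging levels.
STAGES = np.array(["A", "B", "C", "D"])


def impute_mnar(stage, elicited_probs, prior_strength=50.0,
                n_imputations=10, seed=0):
    """Impute missing stages from Dirichlet draws centred on elicited beliefs.

    stage          : sequence of stage labels, with None marking a missing value
    elicited_probs : hypothetical expert probabilities for each stage among
                     the missing cases (assumed strictly positive, summing to 1)
    prior_strength : assumed scaling factor; larger values keep each draw
                     closer to the elicited probabilities
    """
    rng = np.random.default_rng(seed)
    stage = np.asarray(stage, dtype=object)
    missing = np.array([s is None for s in stage])
    alpha = prior_strength * np.asarray(elicited_probs, dtype=float)

    imputed = []
    for _ in range(n_imputations):
        # One Dirichlet draw per imputed dataset gives the stage
        # probabilities used for the missing cases in that dataset.
        p = rng.dirichlet(alpha)
        filled = stage.copy()
        filled[missing] = rng.choice(STAGES, size=int(missing.sum()), p=p)
        imputed.append(filled)
    return imputed


# Hypothetical data: experts believe missing cases skew towards late stages.
observed = ["A", None, "C", None, "D", "B", None]
datasets = impute_mnar(observed, [0.1, 0.2, 0.3, 0.4], n_imputations=5)
```

In a full analysis, the model of interest would be fitted to each imputed dataset and the results pooled, then compared with the corresponding MAR-based estimates; that step is omitted here.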

Similar Items