Assessment of Internal Validity of Prognostic Models through Bootstrapping and Multiple Imputation of Missing Data
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Tehran University of Medical Sciences, 2012-05-01 |
Series: | Iranian Journal of Public Health |
Subjects: | |
Online Access: | https://ijph.tums.ac.ir/index.php/ijph/article/view/2581 |
Summary: | Background: Prognostic models have clinical appeal as aids to therapeutic decision making. Two main practical challenges in developing such models are assessing their validity and imputing missing data. This study highlights the importance of imputing missing data and of applying the bootstrap technique in the development, simplification, and assessment of internal validity of a prognostic model.
Methods: Overall, 310 breast cancer patients were recruited. Missing data were imputed 10 times. Then, to assess the sensitivity of the model to small changes in the data (internal validity), 100 bootstrap samples were drawn from each of the 10 imputed data sets, giving 1000 samples. A Cox regression model was fitted to each of the 1000 samples. Only variables retained in more than 50% of samples were used in developing the final model.
Results: Four variables remained significant in more than 50% (i.e. more than 500) of the bootstrap samples: tumour size (91%), tumour grade (64%), history of benign breast disease (77%), and age at diagnosis (59%). Tumour size was the strongest predictor, with an inclusion frequency exceeding 90%. Number of deliveries was correlated with age at diagnosis (r=0.35, P<0.001). Considered together, these two variables remained significant in more than 90% of samples.
Conclusion: We addressed two important methodological issues using a cohort of breast cancer patients. The algorithm combines multiple imputation of missing data with bootstrapping and can potentially be applied in all kinds of regression modelling exercises to address the internal validity of models. |
---|---|
ISSN: | 2251-6085 2251-6093 |
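
The resampling scheme described in the Methods (10 imputed data sets, 100 bootstrap samples from each, variables kept if selected in more than 50% of the 1000 samples) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `bootstrap_sample`, `inclusion_frequencies`, and `retained` are hypothetical, the imputation step is assumed to have been done already, and the Cox regression with variable selection is represented by a caller-supplied `select_variables` function rather than implemented here.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement from one data set."""
    return [rng.choice(rows) for _ in rows]


def inclusion_frequencies(imputed_datasets, select_variables, n_boot=100, seed=0):
    """Fraction of bootstrap samples in which each variable is selected.

    imputed_datasets: list of already-imputed data sets (each a list of rows).
    select_variables: hypothetical callable that fits a model to one sample
        and returns the names of the variables it retains (for example,
        those significant in a Cox regression). Supplied by the caller.
    """
    rng = random.Random(seed)
    counts = Counter()
    total = 0
    for data in imputed_datasets:
        for _ in range(n_boot):
            sample = bootstrap_sample(data, rng)
            # Count each variable at most once per sample.
            counts.update(set(select_variables(sample)))
            total += 1
    return {var: n / total for var, n in counts.items()}


def retained(frequencies, threshold=0.5):
    """Variables whose inclusion frequency strictly exceeds the threshold."""
    return sorted(var for var, freq in frequencies.items() if freq > threshold)


# Inclusion frequencies reported in the Results, applied to the 50% rule.
reported = {
    "tumour size": 0.91,
    "tumour grade": 0.64,
    "history of benign breast disease": 0.77,
    "age at diagnosis": 0.59,
}
final_model_variables = retained(reported)
```

With 10 imputed data sets and `n_boot=100`, the loop visits 1000 samples, so the 50% threshold corresponds to more than 500 samples. Applied to the reported frequencies, all four variables pass the threshold.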