Feature selection through validation and un-censoring of endovascular repair survival data for predicting the risk of re-intervention

Abstract Background The feature selection (FS) process is essential in medicine because it reduces the effort and time physicians spend measuring unnecessary features. Choosing useful variables is difficult in the presence of censoring, which is the unique characteristic in survival ana...

Bibliographic Details
Main Authors: Omneya Attallah, Alan Karthikesalingam, Peter J. E. Holt, Matthew M. Thompson, Rob Sayers, Matthew J. Bown, Eddie C. Choke, Xianghong Ma
Format: Article
Language: English
Published: BMC 2017-08-01
Series: BMC Medical Informatics and Decision Making
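Subjects: Survival analysis, Censoring, Feature selection, Model selection, Factor analysis, Cox’s hazard proportional model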
Online Access: http://link.springer.com/article/10.1186/s12911-017-0508-3
author Omneya Attallah
Alan Karthikesalingam
Peter J. E. Holt
Matthew M. Thompson
Rob Sayers
Matthew J. Bown
Eddie C. Choke
Xianghong Ma
author_sort Omneya Attallah
collection DOAJ
description Abstract Background The feature selection (FS) process is essential in medicine because it reduces the effort and time physicians spend measuring unnecessary features. Choosing useful variables is difficult in the presence of censoring, the distinguishing characteristic of survival analysis. Most survival FS methods depend on Cox’s proportional hazards model; machine learning techniques (MLT) are preferred but are not commonly used because of censoring. Techniques that have been proposed to adapt MLT to perform FS with survival data cannot be used when the level of censoring is high. The authors’ previous publications proposed a technique to deal with a high level of censoring and used existing FS techniques to reduce the dataset dimension. In this paper, a new FS technique is proposed and combined with feature transformation and the proposed un-censoring approaches to select a reduced set of features and produce a stable predictive model.
Methods An FS technique based on an artificial neural network (ANN) MLT is proposed to deal with highly censored endovascular aortic repair (EVAR) survival data. The EVAR datasets were collected between 2004 and 2010 from two vascular centers in order to produce a final stable model. Almost 91% of the patients are censored. The proposed approach uses a wrapper FS method with an ANN to select a reduced subset of features that predict the risk of EVAR re-intervention after 5 years for patients from two different centers in the United Kingdom, so that it can potentially be applied to cross-center predictions. The proposed model is compared with two popular FS techniques used with Cox’s model: the Akaike and Bayesian information criteria (AIC, BIC).
Results The final model outperforms the other methods in distinguishing the high- and low-risk groups: its concordance index and estimated AUC are both better than those of Cox’s model based on the AIC, BIC, Lasso, and SCAD approaches. These models have p-values lower than 0.05, meaning that patients in different risk groups can be separated significantly and those who would need re-intervention can be correctly predicted.
Conclusion The proposed approach will save physicians the time and effort of collecting unnecessary variables. The final reduced model was able to predict the long-term risk of aortic complications after EVAR. This predictive model can help clinicians decide on patients’ future observation plans.
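The concordance index named in the Results can be illustrated with a minimal sketch. The code below is not taken from the article; it is a plain Python illustration of Harrell's concordance index for right-censored data, assuming a higher risk score should go with an earlier event. The names `concordance_index`, `times`, `events`, and `risks`, and the toy values, are hypothetical and chosen only for this example. Pairs with tied follow-up times are skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. re-intervention) was observed, 0 if censored
    risks  -- predicted risk score per patient (higher = earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed event;
        # tied times are skipped here (simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event has the higher risk: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): two observed events, two censored patients.
toy_times = [2, 4, 6, 8]
toy_events = [1, 0, 1, 0]
toy_risks = [0.9, 0.3, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. Because censored patients can only serve as the longer-lived member of a pair, heavy censoring such as the roughly 91% reported here leaves few comparable pairs, which is one reason highly censored data are hard to evaluate.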
first_indexed 2024-12-20T07:22:28Z
format Article
id doaj.art-7229a7098d984d659ebe55febcabd25e
institution Directory Open Access Journal
issn 1472-6947
language English
last_indexed 2024-12-20T07:22:28Z
publishDate 2017-08-01
publisher BMC
record_format Article
series BMC Medical Informatics and Decision Making
spelling doaj.art-7229a7098d984d659ebe55febcabd25e
2022-12-21T19:48:39Z
eng
BMC
BMC Medical Informatics and Decision Making
1472-6947
2017-08-01
17
1
1
19
10.1186/s12911-017-0508-3
Feature selection through validation and un-censoring of endovascular repair survival data for predicting the risk of re-intervention
Omneya Attallah 0
Alan Karthikesalingam 1
Peter J. E. Holt 2
Matthew M. Thompson 3
Rob Sayers 4
Matthew J. Bown 5
Eddie C. Choke 6
Xianghong Ma 7
School of Engineering and Applied Science, Aston University
St George’s Vascular Institute
St George’s Vascular Institute
St George’s Vascular Institute
St George’s Vascular Institute, St George’s University Hospitals NHS Foundation Trust
Vascular Surgery Group, University of Leicester
Vascular Surgery Group, Robert Kilpatrick Clinical Sciences Building, Leicester Royal Infirmary, University of Leicester
School of Engineering and Applied Science, Aston University
http://link.springer.com/article/10.1186/s12911-017-0508-3
Survival analysis
Censoring
Feature selection
Model selection
Factor analysis
Cox’s hazard proportional model
title Feature selection through validation and un-censoring of endovascular repair survival data for predicting the risk of re-intervention
topic Survival analysis
Censoring
Feature selection
Model selection
Factor analysis
Cox’s hazard proportional model
url http://link.springer.com/article/10.1186/s12911-017-0508-3