Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques

Abstract In this study, we utilized data from the Surveillance, Epidemiology, and End Results (SEER) database to predict glioblastoma patients’ survival outcomes. To assess dataset skewness and detect feature importance, we applied Pearson's second coefficient test of skewness and the Ordin...

Bibliographic Details
Main Authors:	Samin Babaei Rikan, Amir Sorayaie Azar, Amin Naemi, Jamshid Bagherzadeh Mohasefi, Habibollah Pirnejad, Uffe Kock Wiil
Format:	Article
Language:	English
Published:	Nature Portfolio 2024-01-01
Series:	Scientific Reports
Online Access:	https://doi.org/10.1038/s41598-024-53006-2

_version_	1797274702932082688
author	Samin Babaei Rikan Amir Sorayaie Azar Amin Naemi Jamshid Bagherzadeh Mohasefi Habibollah Pirnejad Uffe Kock Wiil
author_facet	Samin Babaei Rikan Amir Sorayaie Azar Amin Naemi Jamshid Bagherzadeh Mohasefi Habibollah Pirnejad Uffe Kock Wiil
author_sort	Samin Babaei Rikan
collection	DOAJ
description	Abstract In this study, we utilized data from the Surveillance, Epidemiology, and End Results (SEER) database to predict glioblastoma patients’ survival outcomes. To assess dataset skewness and detect feature importance, we applied Pearson's second coefficient test of skewness and the Ordinary Least Squares method, respectively. Using two sampling strategies, holdout and five-fold cross-validation, we developed five machine learning (ML) models alongside a feed-forward deep neural network (DNN) for the multiclass classification and regression prediction of glioblastoma patient survival. After balancing the classification and regression datasets, we obtained 46,340 and 28,573 samples, respectively. Shapley additive explanations (SHAP) were then used to explain the decision-making process of the best model. In both classification and regression tasks, as well as across holdout and cross-validation sampling strategies, the DNN consistently outperformed the ML models. Notably, the accuracies were 90.25% and 90.22% for holdout and five-fold cross-validation, respectively, while the corresponding R² values were 0.6565 and 0.6622. SHAP analysis identified age at diagnosis as the most influential feature in the DNN's survival predictions. These findings suggest that the DNN holds promise as a practical auxiliary tool for clinicians, aiding them in optimal decision-making concerning the treatment and care trajectories for glioblastoma patients.
first_indexed	2024-03-07T15:02:04Z
format	Article
id	doaj.art-1ca083ad63994163a750984fcf1ca7ba
institution	Directory Open Access Journal
issn	2045-2322
language	English
last_indexed	2024-03-07T15:02:04Z
publishDate	2024-01-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj.art-1ca083ad63994163a750984fcf1ca7ba | 2024-03-05T19:04:25Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2024-01-01 | 141112 | 10.1038/s41598-024-53006-2 | Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques | Samin Babaei Rikan (0); Amir Sorayaie Azar (1); Amin Naemi (2); Jamshid Bagherzadeh Mohasefi (3); Habibollah Pirnejad (4); Uffe Kock Wiil (5) | Department of Computer Engineering, Urmia University; Department of Computer Engineering, Urmia University; SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark; Department of Computer Engineering, Urmia University; Erasmus School of Health Policy and Management (ESHPM), Erasmus University Rotterdam; SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark | Abstract In this study, we utilized data from the Surveillance, Epidemiology, and End Results (SEER) database to predict glioblastoma patients’ survival outcomes. To assess dataset skewness and detect feature importance, we applied Pearson's second coefficient test of skewness and the Ordinary Least Squares method, respectively. Using two sampling strategies, holdout and five-fold cross-validation, we developed five machine learning (ML) models alongside a feed-forward deep neural network (DNN) for the multiclass classification and regression prediction of glioblastoma patient survival. After balancing the classification and regression datasets, we obtained 46,340 and 28,573 samples, respectively. Shapley additive explanations (SHAP) were then used to explain the decision-making process of the best model. In both classification and regression tasks, as well as across holdout and cross-validation sampling strategies, the DNN consistently outperformed the ML models. Notably, the accuracies were 90.25% and 90.22% for holdout and five-fold cross-validation, respectively, while the corresponding R² values were 0.6565 and 0.6622. SHAP analysis identified age at diagnosis as the most influential feature in the DNN's survival predictions. These findings suggest that the DNN holds promise as a practical auxiliary tool for clinicians, aiding them in optimal decision-making concerning the treatment and care trajectories for glioblastoma patients. | https://doi.org/10.1038/s41598-024-53006-2
spellingShingle	Samin Babaei Rikan Amir Sorayaie Azar Amin Naemi Jamshid Bagherzadeh Mohasefi Habibollah Pirnejad Uffe Kock Wiil Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques Scientific Reports
title	Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques
title_full	Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques
title_fullStr	Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques
title_full_unstemmed	Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques
title_short	Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques
title_sort	survival prediction of glioblastoma patients using modern deep learning and machine learning techniques
url	https://doi.org/10.1038/s41598-024-53006-2
work_keys_str_mv	AT saminbabaeirikan survivalpredictionofglioblastomapatientsusingmoderndeeplearningandmachinelearningtechniques AT amirsorayaieazar survivalpredictionofglioblastomapatientsusingmoderndeeplearningandmachinelearningtechniques AT aminnaemi survivalpredictionofglioblastomapatientsusingmoderndeeplearningandmachinelearningtechniques AT jamshidbagherzadehmohasefi survivalpredictionofglioblastomapatientsusingmoderndeeplearningandmachinelearningtechniques AT habibollahpirnejad survivalpredictionofglioblastomapatientsusingmoderndeeplearningandmachinelearningtechniques AT uffekockwiil survivalpredictionofglioblastomapatientsusingmoderndeeplearningandmachinelearningtechniques

Similar Items