Machine learning for survival analysis in cancer research: A comparative study

Overview: Survival analysis is at the basis of every study in the field of cancer research, as every endeavor in this field ultimately aims to improve patients’ survival time or reduce the potential for recurrence. This article presents a summary of some cancer survival analysis techni...


Bibliographic Details
Main Authors: Wafaa Tizi, Abdelaziz Berrado
Format: Article
Language: English
Published: Elsevier 2023-09-01
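Series: Scientific African
Subjects: Cancer survival prediction; Machine learning; Survival analysis; Cancer datasets; Patient features
Online Access: http://www.sciencedirect.com/science/article/pii/S2468227623003356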
author Wafaa Tizi
Abdelaziz Berrado
collection DOAJ
description Overview: Survival analysis is at the basis of every study in the field of cancer research, as every endeavor in this field ultimately aims to improve patients’ survival time or reduce the potential for recurrence. This article presents a summary of some cancer survival analysis techniques and an up-to-date overview of different implementations of Machine Learning in this area of research. It also presents an empirical comparison of selected statistical and Machine Learning approaches on different types of cancer medical datasets. Methods: We explore a selection of recent articles that review the use of Machine Learning in cancer research and/or benchmark the different Machine Learning techniques used in cancer survival analysis. This search resulted in 12 papers, selected following certain criteria. Our aim is to assess the importance of Machine Learning for survival analysis in cancer research compared with statistical methods, and to examine how different Machine Learning techniques may perform in different settings in the context of cancer survival analysis. The techniques were selected based on their popularity. Cox Proportional Hazards with a Ridge penalty, Random Survival Forests, Gradient Boosting for Survival Analysis with a Cox Proportional Hazards loss function, and linear and kernel Support Vector Machines were applied to 10 different cancer survival datasets. The mean Concordance Index and its standard deviation were used to compare the performance of these techniques, and the results were summarized and analyzed for noticeable patterns or trends. Kaplan-Meier plots were used for the non-parametric survival analysis of the different datasets. Results: Cox Proportional Hazards delivers results comparable to those of the Machine Learning techniques, thanks to the Ridge penalty and the different methods for dealing with tied events, but fails to produce results on higher-dimensional datasets. All techniques benchmarked in the study had comparable performance. Using prognostic tools when there is a mismatch between the patients and the populations used to train the models may not be advisable, since each dataset yields a differently shaped survival curve even for a similar cancer type.
first_indexed 2024-03-11T22:26:04Z
format Article
id doaj.art-9c1f7d760a9245e5a04350d0e8b1d73d
institution Directory Open Access Journal
issn 2468-2276
language English
last_indexed 2024-03-11T22:26:04Z
publishDate 2023-09-01
publisher Elsevier
record_format Article
series Scientific African
spelling doaj.art-9c1f7d760a9245e5a04350d0e8b1d73d
2023-09-24T05:16:29Z
eng
Elsevier
Scientific African
2468-2276
2023-09-01
21
e01880
Machine learning for survival analysis in cancer research: A comparative study
Wafaa Tizi [0]
Abdelaziz Berrado [1]
Corresponding author.; Equipe AMIPS - Ecole Mohammadia d'Ingénieurs, Mohammed V University in Rabat, Avenue Ibn Sina, BP765, Agdal, Rabat, Morocco
Equipe AMIPS - Ecole Mohammadia d'Ingénieurs, Mohammed V University in Rabat, Avenue Ibn Sina, BP765, Agdal, Rabat, Morocco
http://www.sciencedirect.com/science/article/pii/S2468227623003356
Cancer survival prediction
Machine learning
Survival analysis
Cancer datasets
Patient features
title Machine learning for survival analysis in cancer research: A comparative study
topic Cancer survival prediction
Machine learning
Survival analysis
Cancer datasets
Patient features
url http://www.sciencedirect.com/science/article/pii/S2468227623003356
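
The description above names the mean Concordance Index and its standard deviation as the comparison metric. As an illustration only, the following is a minimal pure-Python sketch of Harrell's concordance index; it is not the authors' implementation, and the record does not say which software they used. The function name `concordance_index`, the toy patient data, and the per-fold scores are all hypothetical.

```python
from statistics import mean, stdev


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means a shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i had an observed event
            # strictly before patient j's recorded time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event received the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(times, events, risk_scores)

# Hypothetical per-fold scores, summarized as mean and standard deviation.
fold_scores = [0.70, 0.74, 0.72]
summary = (mean(fold_scores), stdev(fold_scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a mean and spread over repeated splits gives a compact way to compare models across datasets.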