Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data

Background: It is important to be able to predict, for each individual patient, the likelihood of later metastatic occurrence, because the prediction can guide treatment plans tailored to a specific patient to prevent metastasis and to help avoid under-treatment or over-treatment. Deep neural networ...

Full description

Bibliographic Details
Main Authors:	Xia Jiang, Chuhan Xu
Format:	Article
Language:	English
Published:	MDPI AG 2022-09-01
Series:	Journal of Clinical Medicine
Subjects:	deep learning DNN machine learning breast cancer metastasis metastatic breast cancer
Online Access:	https://www.mdpi.com/2077-0383/11/19/5772

_version_	1797478611074154496
author	Xia Jiang Chuhan Xu
author_facet	Xia Jiang Chuhan Xu
author_sort	Xia Jiang
collection	DOAJ
description	Background: It is important to be able to predict, for each individual patient, the likelihood of later metastatic occurrence, because the prediction can guide treatment plans tailored to a specific patient to prevent metastasis and to help avoid under-treatment or over-treatment. Deep neural network (DNN) learning, commonly referred to as deep learning, has become popular due to its success in image detection and prediction, but questions such as whether deep learning outperforms other machine learning methods when using non-image clinical data remain unanswered. Grid search has been introduced to deep learning hyperparameter tuning for the purpose of improving its prediction performance, but the effect of grid search on other machine learning methods are under-studied. In this research, we take the empirical approach to study the performance of deep learning and other machine learning methods when using non-image clinical data to predict the occurrence of breast cancer metastasis (BCM) 5, 10, or 15 years after the initial treatment. We developed prediction models using the deep feedforward neural network (DFNN) methods, as well as models using nine other machine learning methods, including naïve Bayes (NB), logistic regression (LR), support vector machine (SVM), LASSO, decision tree (DT), k-nearest neighbor (KNN), random forest (RF), AdaBoost (ADB), and XGBoost (XGB). We used grid search to tune hyperparameters for all methods. We then compared our feedforward deep learning models to the models trained using the nine other machine learning methods. Results: Based on the mean test AUC (Area under the ROC Curve) results, DFNN ranks 6th, 4th, and 3rd when predicting 5-year, 10-year, and 15-year BCM, respectively, out of 10 methods. The top performing methods in predicting 5-year BCM are XGB (1st), RF (2nd), and KNN (3rd). For predicting 10-year BCM, the top performers are XGB (1st), RF (2nd), and NB (3rd). Finally, for 15-year BCM, the top performers are SVM (1st), LR and LASSO (tied for 2nd), and DFNN (3rd). The ensemble methods RF and XGB outperform other methods when data are less balanced, while SVM, LR, LASSO, and DFNN outperform other methods when data are more balanced. Our statistical testing results show that at a significance level of 0.05, DFNN overall performs comparably to other machine learning methods when predicting 5-year, 10-year, and 15-year BCM. Conclusions: Our results show that deep learning with grid search overall performs at least as well as other machine learning methods when using non-image clinical data. It is interesting to note that some of the other machine learning methods, such as XGB, RF, and SVM, are very strong competitors of DFNN when incorporating grid search. It is also worth noting that the computation time required to do grid search with DFNN is much more than that required to do grid search with the other nine machine learning methods.
first_indexed	2024-03-09T21:34:10Z
format	Article
id	doaj.art-3b8bc2fd814046338367fd10e4c21f32
institution	Directory Open Access Journal
issn	2077-0383
language	English
last_indexed	2024-03-09T21:34:10Z
publishDate	2022-09-01
publisher	MDPI AG
record_format	Article
series	Journal of Clinical Medicine
spelling	doaj.art-3b8bc2fd814046338367fd10e4c21f322023-11-23T20:48:31ZengMDPI AGJournal of Clinical Medicine2077-03832022-09-011119577210.3390/jcm11195772Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical DataXia Jiang0Chuhan Xu1Department of Biomedical Informatics, SOM, University of Pittsburgh, Pittsburgh, PA 15217, USADepartment of Biomedical Informatics, SOM, University of Pittsburgh, Pittsburgh, PA 15217, USABackground: It is important to be able to predict, for each individual patient, the likelihood of later metastatic occurrence, because the prediction can guide treatment plans tailored to a specific patient to prevent metastasis and to help avoid under-treatment or over-treatment. Deep neural network (DNN) learning, commonly referred to as deep learning, has become popular due to its success in image detection and prediction, but questions such as whether deep learning outperforms other machine learning methods when using non-image clinical data remain unanswered. Grid search has been introduced to deep learning hyperparameter tuning for the purpose of improving its prediction performance, but the effect of grid search on other machine learning methods are under-studied. In this research, we take the empirical approach to study the performance of deep learning and other machine learning methods when using non-image clinical data to predict the occurrence of breast cancer metastasis (BCM) 5, 10, or 15 years after the initial treatment. We developed prediction models using the deep feedforward neural network (DFNN) methods, as well as models using nine other machine learning methods, including naïve Bayes (NB), logistic regression (LR), support vector machine (SVM), LASSO, decision tree (DT), k-nearest neighbor (KNN), random forest (RF), AdaBoost (ADB), and XGBoost (XGB). We used grid search to tune hyperparameters for all methods. We then compared our feedforward deep learning models to the models trained using the nine other machine learning methods. Results: Based on the mean test AUC (Area under the ROC Curve) results, DFNN ranks 6th, 4th, and 3rd when predicting 5-year, 10-year, and 15-year BCM, respectively, out of 10 methods. The top performing methods in predicting 5-year BCM are XGB (1st), RF (2nd), and KNN (3rd). For predicting 10-year BCM, the top performers are XGB (1st), RF (2nd), and NB (3rd). Finally, for 15-year BCM, the top performers are SVM (1st), LR and LASSO (tied for 2nd), and DFNN (3rd). The ensemble methods RF and XGB outperform other methods when data are less balanced, while SVM, LR, LASSO, and DFNN outperform other methods when data are more balanced. Our statistical testing results show that at a significance level of 0.05, DFNN overall performs comparably to other machine learning methods when predicting 5-year, 10-year, and 15-year BCM. Conclusions: Our results show that deep learning with grid search overall performs at least as well as other machine learning methods when using non-image clinical data. It is interesting to note that some of the other machine learning methods, such as XGB, RF, and SVM, are very strong competitors of DFNN when incorporating grid search. It is also worth noting that the computation time required to do grid search with DFNN is much more than that required to do grid search with the other nine machine learning methods.https://www.mdpi.com/2077-0383/11/19/5772deep learningDNNmachine learningbreast cancermetastasismetastatic breast cancer
spellingShingle	Xia Jiang Chuhan Xu Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data Journal of Clinical Medicine deep learning DNN machine learning breast cancer metastasis metastatic breast cancer
title	Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data
title_full	Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data
title_fullStr	Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data
title_full_unstemmed	Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data
title_short	Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data
title_sort	deep learning and machine learning with grid search to predict later occurrence of breast cancer metastasis using clinical data
topic	deep learning DNN machine learning breast cancer metastasis metastatic breast cancer
url	https://www.mdpi.com/2077-0383/11/19/5772
work_keys_str_mv	AT xiajiang deeplearningandmachinelearningwithgridsearchtopredictlateroccurrenceofbreastcancermetastasisusingclinicaldata AT chuhanxu deeplearningandmachinelearningwithgridsearchtopredictlateroccurrenceofbreastcancermetastasisusingclinicaldata

Deep Learning and Machine Learning with Grid Search to Predict Later Occurrence of Breast Cancer Metastasis Using Clinical Data

Similar Items