Prediction of lymph node metastasis in patients with breast invasive micropapillary carcinoma based on machine learning and SHapley Additive exPlanations framework

Abstract. Background and purpose: Machine learning (ML) is applied for outcome prediction and treatment support. This study aims to develop different ML models to predict the risk of axillary lymph node metastasis (LNM) in breast invasive micropapillary carcinoma (IMPC) and to explore the risk factors of...


Bibliographic Details
Main Authors: Cong Jiang, Yuting Xiu, Kun Qiao, Xiao Yu, Shiyuan Zhang, Yuanxi Huang
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-09-01
Series: Frontiers in Oncology
Subjects: machine learning; SHAP; IMPC; nomogram; lymph node metastasis
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2022.981059/full
_version_ 1798032898985558016
author Cong Jiang
Yuting Xiu
Kun Qiao
Xiao Yu
Shiyuan Zhang
Yuanxi Huang
collection DOAJ
description Abstract
Background and purpose: Machine learning (ML) is applied for outcome prediction and treatment support. This study aims to develop different ML models to predict the risk of axillary lymph node metastasis (LNM) in breast invasive micropapillary carcinoma (IMPC) and to explore the risk factors of LNM.
Methods: From the Surveillance, Epidemiology, and End Results (SEER) database and the records of our hospital, a total of 1547 patients diagnosed with breast IMPC were included in this study. ML models were built and externally validated. The SHapley Additive exPlanations (SHAP) framework was applied to explain the optimal model; multivariable analysis was performed with logistic regression (LR); and nomograms were constructed according to the results of the LR analysis.
Results: Age and tumor size were correlated with LNM in both cohorts. The luminal subtype is the most common in patients, with the tumor size ≤20 mm. Compared to other models, XGBoost was the best ML model, with the highest AUC of 0.813 (95% CI: 0.7994-0.8262) and the lowest Brier score of 0.186 (95% CI: 0.799-0.826). SHAP plots demonstrated that tumor size was the most important risk factor for LNM. In both training and test sets, XGBoost had a better AUC (0.761 vs 0.745; 0.813 vs 0.775; respectively), and it also achieved a smaller Brier score (0.202 vs 0.204; 0.186 vs 0.191; 0.220 vs 0.221; respectively) than the LR-based nomogram model in the three different sets. After adjusting for the five most influential variables (tumor size, age, ER, HER-2, and PR), the prediction score based on the XGBoost model was still correlated with LNM (adjusted OR: 2.73, 95% CI: 1.30-5.71, P=0.008).
Conclusions: The XGBoost model outperforms the traditional LR-based nomogram model in predicting the LNM of IMPC patients. Combined with SHAP, it can more intuitively reflect the influence of different variables on LNM. Tumor size was the most important risk factor of LNM for breast IMPC patients. The prediction score obtained by the XGBoost model could be a good indicator for LNM.
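The abstract compares models by AUC (higher is better) and Brier score (lower is better). As a minimal sketch of what these two metrics measure, the following pure-Python functions compute them from binary outcomes and predicted probabilities. The function names and the toy data are hypothetical and chosen only for illustration; they are not the study's data or the authors' code.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive case is scored above a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = lymph node metastasis present, 0 = absent.
labels = [1, 0, 1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

print(f"AUC = {roc_auc(labels, probs):.3f}")
print(f"Brier score = {brier_score(labels, probs):.3f}")
```

The pairwise AUC above is quadratic in the number of cases, which is fine for illustration; for real cohorts a rank-based implementation from an established library would be the usual choice.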
first_indexed 2024-04-11T20:21:18Z
format Article
id doaj.art-523f5c93f2db4aacb0c3b3cbf8c4aacd
institution Directory Open Access Journal
issn 2234-943X
language English
last_indexed 2024-04-11T20:21:18Z
publishDate 2022-09-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
title Prediction of lymph node metastasis in patients with breast invasive micropapillary carcinoma based on machine learning and SHapley Additive exPlanations framework
topic machine learning
SHAP
IMPC
nomogram
lymph node metastasis
url https://www.frontiersin.org/articles/10.3389/fonc.2022.981059/full