Prediction of lymph node metastasis in patients with breast invasive micropapillary carcinoma based on machine learning and SHapley Additive exPlanations framework
Abstract. Background and purpose: Machine learning (ML) is applied for outcome prediction and treatment support. This study aims to develop different ML models to predict the risk of axillary lymph node metastasis (LNM) in breast invasive micropapillary carcinoma (IMPC) and to explore the risk factors of...
Main Authors: | Cong Jiang, Yuting Xiu, Kun Qiao, Xiao Yu, Shiyuan Zhang, Yuanxi Huang |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-09-01 |
Series: | Frontiers in Oncology |
Subjects: | machine learning; SHAP; IMPC; nomogram; lymph node metastasis |
Online Access: | https://www.frontiersin.org/articles/10.3389/fonc.2022.981059/full |
_version_ | 1798032898985558016 |
---|---|
author | Cong Jiang Yuting Xiu Kun Qiao Xiao Yu Shiyuan Zhang Yuanxi Huang |
author_sort | Cong Jiang |
collection | DOAJ |
description | Abstract. Background and purpose: Machine learning (ML) is applied for outcome prediction and treatment support. This study aims to develop different ML models to predict the risk of axillary lymph node metastasis (LNM) in breast invasive micropapillary carcinoma (IMPC) and to explore the risk factors of LNM. Methods: From the Surveillance, Epidemiology, and End Results (SEER) database and the records of our hospital, a total of 1547 patients diagnosed with breast IMPC were included in this study. ML models were built and externally validated. The SHapley Additive exPlanations (SHAP) framework was applied to explain the optimal model; multivariable analysis was performed with logistic regression (LR); and nomograms were constructed according to the results of the LR analysis. Results: Age and tumor size were correlated with LNM in both cohorts. The luminal subtype is the most common in patients, with the tumor size <=20 mm. Compared to other models, Xgboost was the best ML model with the biggest AUC of 0.813 (95% CI: 0.7994-0.8262) and the smallest Brier score of 0.186 (95% CI: 0.799-0.826). SHAP plots demonstrated that tumor size was the most vital risk factor for LNM. In both training and test sets, Xgboost had better AUC (0.761 vs 0.745; 0.813 vs 0.775; respectively), and it also achieved a smaller Brier score (0.202 vs 0.204; 0.186 vs 0.191; 0.220 vs 0.221; respectively) than the nomogram model based on LR in those three different sets. After adjusting for the five most influential variables (tumor size, age, ER, HER-2, and PR), the prediction score based on the Xgboost model was still correlated with LNM (adjusted OR: 2.73, 95% CI: 1.30-5.71, P = 0.008). Conclusions: The Xgboost model outperforms the traditional LR-based nomogram model in predicting the LNM of IMPC patients. Combined with SHAP, it can more intuitively reflect the influence of different variables on the LNM. The tumor size was the most important risk factor of LNM for breast IMPC patients. The prediction score obtained by the Xgboost model could be a good indicator for LNM. |
first_indexed | 2024-04-11T20:21:18Z |
format | Article |
id | doaj.art-523f5c93f2db4aacb0c3b3cbf8c4aacd |
institution | Directory Open Access Journal |
issn | 2234-943X |
language | English |
last_indexed | 2024-04-11T20:21:18Z |
publishDate | 2022-09-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Oncology |
title | Prediction of lymph node metastasis in patients with breast invasive micropapillary carcinoma based on machine learning and SHapley Additive exPlanations framework |
topic | machine learning SHAP IMPC nomogram lymph node metastasis |
url | https://www.frontiersin.org/articles/10.3389/fonc.2022.981059/full |
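
The abstract compares models by AUC (higher is better) and Brier score (lower is better). As a rough illustration of what those two numbers measure, here is a minimal sketch in plain Python. The function names and the toy labels and scores are hypothetical and are not taken from the study, which does not describe its evaluation code in this record.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random positive case is scored above a random negative one.

    Ties count as half a win (Mann-Whitney formulation of the AUC).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = LNM present, 0 = LNM absent.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

print(f"AUC   = {roc_auc(labels, scores):.3f}")
print(f"Brier = {brier_score(labels, scores):.3f}")
```

AUC reflects only how well the scores rank positive cases above negative ones, while the Brier score also penalizes poorly calibrated probabilities, which is presumably why the abstract reports both.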