Prediction of EGFR Mutation Status in Non–Small Cell Lung Cancer Based on Ensemble Learning


Bibliographic Details
Main Authors: Youdan Feng, Fan Song, Peng Zhang, Guangda Fan, Tianyi Zhang, Xiangyu Zhao, Chenbin Ma, Yangyang Sun, Xiao Song, Huangsheng Pu, Fei Liu, Guanglei Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-06-01
Series: Frontiers in Pharmacology
Subjects: non–small cell lung cancer, radiogenomics, EGFR, computed tomography, ensemble learning
Online Access: https://www.frontiersin.org/articles/10.3389/fphar.2022.897597/full
collection DOAJ
description Objectives: We aimed to determine whether ensemble learning can improve the performance of a model that predicts epidermal growth factor receptor (EGFR) mutation status. Methods: We retrospectively collected data from 168 patients with non–small cell lung cancer (NSCLC) who underwent both computed tomography (CT) examination and EGFR testing. Using radiomics features extracted from the CT images, an ensemble model was built from four individual classifiers: logistic regression (LR), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost). The synthetic minority oversampling technique (SMOTE) was also used to reduce the influence of data imbalance. Model performance was evaluated using the area under the curve (AUC). Results: With the 26 radiomics features retained after feature selection, the SVM performed best among the four individual classifiers (AUCs of 0.8634 and 0.7885 on the training and test sets, respectively). The ensemble of RF, XGBoost, and LR achieved the best performance overall (AUCs of 0.8465 and 0.8654 on the training and test sets, respectively). Conclusion: Ensemble learning can improve model performance in predicting the EGFR mutation status of patients with NSCLC, showing potential value in clinical practice.
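The abstract does not say how the individual classifiers were combined, so the following is only a minimal sketch under stated assumptions: it assumes soft voting (averaging each classifier's predicted probability of an EGFR mutation) and computes the AUC as the fraction of correctly ranked positive/negative pairs. The labels and probabilities are made-up toy values, not data from the study, and the helper names `auc` and `soft_vote` are hypothetical.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def soft_vote(prob_lists):
    """Assumed combination rule: average the per-sample positive-class
    probabilities produced by several classifiers."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


# Toy, invented values: 1 = EGFR mutant, 0 = wild type.
labels = [1, 1, 1, 0, 0, 0]
probs = {
    "RF": [0.9, 0.4, 0.7, 0.5, 0.2, 0.1],
    "XGBoost": [0.8, 0.6, 0.3, 0.4, 0.5, 0.2],
    "LR": [0.6, 0.7, 0.6, 0.3, 0.4, 0.65],
}

ensemble = soft_vote(list(probs.values()))

for name, p in probs.items():
    print(name, round(auc(labels, p), 4))
print("ensemble", round(auc(labels, ensemble), 4))
```

In this toy data each classifier misranks a different pair of samples, so the averaged probabilities rank the samples better than any single classifier does. That illustrates why an ensemble can help; it is not evidence about the reported results.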
first_indexed 2024-04-12T12:27:43Z
format Article
id doaj.art-5f74726c605c423b9e8f807b8594e19f
institution Directory Open Access Journal
issn 1663-9812
language English
last_indexed 2024-04-12T12:27:43Z
publishDate 2022-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Pharmacology
spelling doaj.art-5f74726c605c423b9e8f807b8594e19f
2022-12-22T03:33:07Z
eng
Frontiers Media S.A.
Frontiers in Pharmacology
1663-9812
2022-06-01
13
10.3389/fphar.2022.897597
897597
Prediction of EGFR Mutation Status in Non–Small Cell Lung Cancer Based on Ensemble Learning
Youdan Feng (0), Fan Song (1), Peng Zhang (2), Guangda Fan (3), Tianyi Zhang (4), Xiangyu Zhao (5), Chenbin Ma (6), Yangyang Sun (7), Xiao Song (8), Huangsheng Pu (9), Fei Liu (10), Guanglei Zhang (11)
0-7, 11: Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, China
8: School of Medical Imaging, Shanxi Medical University, Taiyuan, China
9: College of Advanced Interdisciplinary Studies, National University of Defense Technology, Changsha, China
10: Beijing Advanced Information and Industrial Technology Research Institute, Beijing Information Science and Technology University, Beijing, China
https://www.frontiersin.org/articles/10.3389/fphar.2022.897597/full
non–small cell lung cancer; radiogenomics; EGFR; computed tomography; ensemble learning
title Prediction of EGFR Mutation Status in Non–Small Cell Lung Cancer Based on Ensemble Learning
topic non–small cell lung cancer
radiogenomics
EGFR
computed tomography
ensemble learning
url https://www.frontiersin.org/articles/10.3389/fphar.2022.897597/full