Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary

Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying the primary site in MCCUP is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), and it remains difficult to identify the primary site...

Bibliographic Details
Main Authors: Di Lu, Jianjun Jiang, Xiguang Liu, He Wang, Siyang Feng, Xiaoshun Shi, Zhizhi Wang, Zhiming Chen, Xuebin Yan, Hua Wu, Kaican Cai
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-12-01
Series: Frontiers in Genetics
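Subjects: metastatic cervical carcinoma from unknown primary; random forest; neural network; support vector machine; predict; primary sites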
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2020.614823/full
collection DOAJ
description Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying the primary site in MCCUP is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), and it remains difficult to identify the primary site pathologically. Novel and effective methods to determine the primary site in MCCUP are therefore needed. In the present study, RNA sequencing data for four types of SCC and Pan-Cancer were obtained from The Cancer Genome Atlas (TCGA). After data pre-processing, the differentially expressed genes (DEGs) of each type were identified. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses indicated that the significantly changed genes of the four types of SCC share many similar molecular functions and histological features. Three machine learning models, random forest (RF), support vector machine (SVM), and neural network (NN), based on ten genes, were then developed to distinguish the four types of SCC. In prediction tests, the RF model performed best on the external validation set, with an overall predictive accuracy of 88.2%, sensitivity of 88.71%, and specificity of 95.42%. The NN model ranked second, with an overall accuracy of 82.02%, sensitivity of 81.23%, and specificity of 93.04%. The SVM model ranked last, with an overall accuracy of 76.69%, sensitivity of 74.81%, and specificity of 90.84%. This analysis of the similarities and differences among the four types of SCC, together with the new models developed to distinguish them with informatics methods, sheds light on precision diagnosis of MCCUP in the future.
first_indexed 2024-12-24T04:49:20Z
format Article
id doaj.art-0e05f3ffe7464ca6baf6ab813b670c96
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-24T04:49:20Z
publishDate 2020-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-0e05f3ffe7464ca6baf6ab813b670c96 2022-12-21T17:14:36Z eng Frontiers Media S.A. Frontiers in Genetics 1664-8021 2020-12-01 11 10.3389/fgene.2020.614823 614823
Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary
Di Lu, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Jianjun Jiang, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Xiguang Liu, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
He Wang, Department of Thoracic Surgery, Peking University Shenzhen Hospital, Shenzhen, China
Siyang Feng, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Xiaoshun Shi, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Zhizhi Wang, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Zhiming Chen, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Xuebin Yan, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Hua Wu, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
Kaican Cai, Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
topic metastatic cervical carcinoma from unknown primary
random forest
neural network
support vector machine
predict
primary sites
url https://www.frontiersin.org/articles/10.3389/fgene.2020.614823/full