Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary
Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying the primary site in MCCUP is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), and it remains difficult to identify the primary site...
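Main Authors: | Di Lu, Jianjun Jiang, Xiguang Liu, He Wang, Siyang Feng, Xiaoshun Shi, Zhizhi Wang, Zhiming Chen, Xuebin Yan, Hua Wu, Kaican Cai |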
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2020-12-01 |
Series: | Frontiers in Genetics |
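Subjects: | metastatic cervical carcinoma from unknown primary, random forest, neural network, support vector machine, predict, primary sites |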
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2020.614823/full |
_version_ | 1819295885919518720 |
---|---|
author | Di Lu, Jianjun Jiang, Xiguang Liu, He Wang, Siyang Feng, Xiaoshun Shi, Zhizhi Wang, Zhiming Chen, Xuebin Yan, Hua Wu, Kaican Cai |
collection | DOAJ |
description | Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying its primary site is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), for which the primary site remains difficult to identify pathologically. Novel and effective methods to determine the primary site in MCCUP are therefore needed. In the present study, RNA sequencing data for four types of SCC and for Pan-Cancer were obtained from The Cancer Genome Atlas (TCGA). After data pre-processing, the differentially expressed genes (DEGs) of each type were identified. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses indicated that the significantly changed genes of the four types of SCC share many molecular functions and histological features. Three machine learning models built on ten genes, namely random forest (RF), support vector machine (SVM), and neural network (NN), were then developed to distinguish the four types of SCC. In prediction tests, the RF model performed best on the external validation set, with an overall predictive accuracy of 88.2%, sensitivity of 88.71%, and specificity of 95.42%. The NN model ranked second, with an overall accuracy of 82.02%, sensitivity of 81.23%, and specificity of 93.04%. The SVM model ranked last, with an overall accuracy of 76.69%, sensitivity of 74.81%, and specificity of 90.84%. This analysis of the similarities and differences among the four types of SCC, together with the new informatics-based models for distinguishing them, sheds light on precise MCCUP diagnosis in the future. |
first_indexed | 2024-12-24T04:49:20Z |
format | Article |
id | doaj.art-0e05f3ffe7464ca6baf6ab813b670c96 |
institution | Directory Open Access Journal |
issn | 1664-8021 |
language | English |
last_indexed | 2024-12-24T04:49:20Z |
publishDate | 2020-12-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
title | Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary |
topic | metastatic cervical carcinoma from unknown primary; random forest; neural network; support vector machine; predict; primary sites |
url | https://www.frontiersin.org/articles/10.3389/fgene.2020.614823/full |
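The description reports one overall accuracy, sensitivity, and specificity per model for a four-class problem, but the record does not say how the per-class values were combined. A minimal sketch of one common convention (one-vs-rest, macro-averaged) follows. The function name, the averaging choice, and the toy confusion matrix are assumptions made for illustration; none of them come from the article, and the counts are not the study's data.

```python
from typing import List, Tuple


def multiclass_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (accuracy, macro sensitivity, macro specificity).

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Each class is scored one-vs-rest and the
    per-class values are averaged with equal weight (an assumption).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    sensitivities = []
    specificities = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp   # another class, predicted as k
        tn = total - tp - fn - fp
        # Guard against classes with no positive or no negative samples.
        sensitivities.append(tp / (tp + fn) if (tp + fn) else 0.0)
        specificities.append(tn / (tn + fp) if (tn + fp) else 0.0)

    return correct / total, sum(sensitivities) / n, sum(specificities) / n


# Hypothetical counts for four unnamed classes (10 samples each).
toy_cm = [
    [8, 1, 1, 0],
    [1, 7, 1, 1],
    [0, 2, 8, 0],
    [1, 0, 0, 9],
]

accuracy, sensitivity, specificity = multiclass_metrics(toy_cm)
print(f"accuracy={accuracy:.4f} sensitivity={sensitivity:.4f} specificity={specificity:.4f}")
```

With balanced classes, as in this toy matrix, macro-averaged sensitivity equals overall accuracy. The two diverge when class sizes differ, which is one reason a report may list both figures separately.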