A Transcriptome-Based Deep Neural Network Classifier for Identifying the Site of Origin in Mucinous Cancer

Purpose: There is a lack of tools for identifying the site of origin in mucinous cancer. This study aimed to evaluate the performance of a transcriptome-based classifier for identifying the site of origin in mucinous cancer. Materials and Methods: Transcriptomic data of 1878 non-mucinous and 82 muci...

Bibliographic Details
Main Authors: Taejin Ahn, Kidong Kim, Hyojin Kim, Sarah Kim, Sangick Park, Kyoungbun Lee
Format: Article
Language: English
Published: SAGE Publishing 2022-11-01
Series: Cancer Informatics
Online Access: https://doi.org/10.1177/11769351221135141
collection DOAJ
description Purpose: There is a lack of tools for identifying the site of origin in mucinous cancer. This study aimed to evaluate the performance of a transcriptome-based classifier for identifying the site of origin in mucinous cancer.
Materials and Methods: Transcriptomic data of 1878 non-mucinous and 82 mucinous cancer specimens from 7 sites of origin, namely the uterine cervix (CESC), colon (COAD), pancreas (PAAD), stomach (STAD), uterine endometrium (UCEC), uterine carcinosarcoma (UCS), and ovary (OV), were obtained from The Cancer Genome Atlas and used as the training and validation sets, respectively. Transcriptomic data of 14 mucinous cancer specimens from a tissue archive were used as the test set. To identify the site of origin, a set of 100 differentially expressed genes was selected for each site. After removing genes that appeared in more than one set, 427 genes remained, and their RNA expression profiles at each site of origin were used to train the deep neural network classifier. The performance of the classifier was estimated using the training, validation, and test sets.
Results: The accuracy of the model was 0.998 in the training set and 0.939 (77/82) in the validation set. In the test set, which was newly sequenced from a tissue archive, the model showed an accuracy of 0.857 (12/14). t-SNE analysis revealed that samples in the test set fell within the clusters obtained for the training set.
Conclusion: Although limited by a small sample size, this study showed that a transcriptome-based classifier could correctly identify the site of origin of mucinous cancer.
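The gene-selection step and the reported accuracies can be illustrated with a short Python sketch. It is not the authors' code: the function names `merge_gene_sets` and `accuracy` and the toy gene labels are hypothetical, and only the counts (77/82, 12/14) come from the abstract. With 7 sites and 100 genes each there are 700 candidates, so a final set of 427 implies that 273 entries were repeats.

```python
from typing import Dict, List


def merge_gene_sets(per_site_genes: Dict[str, List[str]]) -> List[str]:
    """Merge per-site gene lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for genes in per_site_genes.values():
        for gene in genes:
            if gene not in seen:
                seen.add(gene)
                merged.append(gene)
    return merged


def accuracy(correct: int, total: int) -> float:
    """Fraction of correctly classified specimens."""
    return correct / total


# Toy input with hypothetical gene labels (the real lists hold 100 genes per site).
toy_sets = {
    "COAD": ["G1", "G2", "G3"],
    "STAD": ["G2", "G4"],
    "OV": ["G3", "G5"],
}
features = merge_gene_sets(toy_sets)

# Counts reported in the abstract.
validation_acc = accuracy(77, 82)
test_acc = accuracy(12, 14)
repeats_removed = 7 * 100 - 427

print(features)
print(round(validation_acc, 3), round(test_acc, 3), repeats_removed)
```

Rounded to three decimals, the two ratios match the accuracies given in the abstract.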
id doaj.art-12687f2cd1fd4788913db132956d84c0
institution Directory Open Access Journal
issn 1176-9351
Volume: 21
Author affiliations:
Taejin Ahn: Department of Life Science, Handong Global University, Pohang, Republic of Korea
Kidong Kim: Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
Hyojin Kim: Department of Pathology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
Sarah Kim: Department of Life Science, Handong Global University, Pohang, Republic of Korea
Sangick Park: Department of Life Science, Handong Global University, Pohang, Republic of Korea
Kyoungbun Lee: Department of Pathology, Seoul National University Hospital, Seoul, Republic of Korea