Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization
The analysis of gene expression quantification data is a powerful and widely used approach in cancer research. This work provides new insights into the transcriptomic changes that occur in healthy uterine tissue compared to those in cancerous tissues and explores the differences associated with uter...
Main Authors: | Lucía Almorox, Laura Antequera, Ignacio Rojas, Luis Javier Herrera, Francisco M. Ortuño |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-02-01 |
Series: | Genes |
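Subjects: | uterine corpus cancer, cervical cancer, cervical adenocarcinoma, cervical squamous cell carcinoma, KnowSeq, RNA-Seq |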
Online Access: | https://www.mdpi.com/2073-4425/15/3/312 |
_version_ | 1797240918705700864 |
---|---|
author | Lucía Almorox, Laura Antequera, Ignacio Rojas, Luis Javier Herrera, Francisco M. Ortuño |
author_facet | Lucía Almorox, Laura Antequera, Ignacio Rojas, Luis Javier Herrera, Francisco M. Ortuño |
author_sort | Lucía Almorox |
collection | DOAJ |
description | The analysis of gene expression quantification data is a powerful and widely used approach in cancer research. This work provides new insights into the transcriptomic changes that occur in healthy uterine tissue compared to those in cancerous tissues and explores the differences associated with uterine cancer localizations and histological subtypes. To achieve this, RNA-Seq data from the TCGA database were preprocessed and analyzed using the KnowSeq package. First, a kNN model was applied to classify uterine cervix cancer, uterine corpus cancer, and healthy uterine samples. Through variable selection, a three-gene signature was identified (VWCE, CLDN15, ADCYAP1R1), achieving consistent 100% test accuracy across 20 repetitions of 5-fold cross-validation. A similar supplementary analysis using miRNA-Seq data from the same samples identified a two-gene miRNA-coding signature that potentially regulates the three-gene signature above; its best classification performance was an 82% macro-F1 score. Subsequently, a kNN model was implemented to classify cervical cancer samples into their two main histological subtypes (adenocarcinoma and squamous cell carcinoma). A single-gene signature (ICA1L) was identified, achieving 100% test accuracy across 20 repetitions of 5-fold cross-validation, and was externally validated using the CGCI program. Finally, an examination of six cervical adenosquamous carcinoma (mixed) samples revealed a pattern in which the gene expression value in the mixed class lay closer to that of the histological subtype with lower expression, prompting a reconsideration of the diagnosis for these mixed samples. In summary, this study provides valuable insights into the molecular mechanisms of uterine cervix and corpus cancers. The newly identified gene signatures show robust predictive capability and can guide future research in cancer diagnosis and treatment methodologies. |
first_indexed | 2024-04-24T18:15:04Z |
format | Article |
id | doaj.art-c3703efa1cb34029934a84d56b778b22 |
institution | Directory of Open Access Journals |
issn | 2073-4425 |
language | English |
last_indexed | 2024-04-24T18:15:04Z |
publishDate | 2024-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Genes |
spelling | doaj.art-c3703efa1cb34029934a84d56b778b22; 2024-03-27T13:43:02Z; eng; MDPI AG; Genes; 2073-4425; 2024-02-01; 15(3), 312; 10.3390/genes15030312; Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization; Lucía Almorox, Laura Antequera, Ignacio Rojas, Luis Javier Herrera, Francisco M. Ortuño (all five authors: Department of Computer Engineering, Automatics and Robotics, C.I.T.I.C., University of Granada, Periodista Rafael Gómez Montero, 2, 18014 Granada, Spain); https://www.mdpi.com/2073-4425/15/3/312; uterine corpus cancer, cervical cancer, cervical adenocarcinoma, cervical squamous cell carcinoma, KnowSeq, RNA-Seq |
title | Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization |
title_sort | gene expression analysis for uterine cervix and corpus cancer characterization |
topic | uterine corpus cancer, cervical cancer, cervical adenocarcinoma, cervical squamous cell carcinoma, KnowSeq, RNA-Seq |
url | https://www.mdpi.com/2073-4425/15/3/312 |
work_keys_str_mv | AT luciaalmorox geneexpressionanalysisforuterinecervixandcorpuscancercharacterization AT lauraantequera geneexpressionanalysisforuterinecervixandcorpuscancercharacterization AT ignaciorojas geneexpressionanalysisforuterinecervixandcorpuscancercharacterization AT luisjavierherrera geneexpressionanalysisforuterinecervixandcorpuscancercharacterization AT franciscomortuno geneexpressionanalysisforuterinecervixandcorpuscancercharacterization |
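
The description above reports kNN classifiers evaluated over 20 repetitions of 5-fold cross-validation and scored with accuracy and macro-F1. The sketch below illustrates that evaluation scheme only. It is not the authors' pipeline, which the record says used the KnowSeq package; it is a pure-Python toy run on synthetic data. Every name in it (`knn_predict`, `stratified_folds`, `macro_f1`, `repeated_cv`, `make_toy_data`), the choice of k = 3, the stratified splitting, and the class centres are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def stratified_folds(y, n_folds, rng):
    """Split sample indices into n_folds folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def repeated_cv(X, y, n_folds=5, n_repeats=20, k=3, seed=0):
    """Return one macro-F1 score per (repetition, fold) pair."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        # A fresh random partition is drawn for every repetition.
        for test_idx in stratified_folds(y, n_folds, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            train_X = [X[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            preds = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
            results.append(macro_f1([y[i] for i in test_idx], preds))
    return results


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic 3-feature samples around well-separated, made-up class centres."""
    rng = random.Random(seed)
    centres = {"cervix": (0, 0, 0), "corpus": (10, 0, 5), "healthy": (0, 10, -5)}
    X, y = [], []
    for label, centre in centres.items():
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0, 1) for c in centre])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    scores = repeated_cv(X, y)
    print(f"{len(scores)} folds, mean macro-F1 = {sum(scores) / len(scores):.3f}")
```

With real expression data, any gene selection step would need to be fitted inside each training fold so that the held-out samples do not influence which genes are chosen; the toy example skips selection entirely.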