Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization

Bibliographic Details
Main Authors: Lucía Almorox, Laura Antequera, Ignacio Rojas, Luis Javier Herrera, Francisco M. Ortuño
Format: Article
Language: English
Published: MDPI AG 2024-02-01
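Series: Genes
Subjects: uterine corpus cancer; cervical cancer; cervical adenocarcinoma; cervical squamous cell carcinoma; KnowSeq; RNA-Seq
Online Access: https://www.mdpi.com/2073-4425/15/3/312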
collection DOAJ
description The analysis of gene expression quantification data is a powerful and widely used approach in cancer research. This work provides new insights into the transcriptomic changes that distinguish healthy uterine tissue from cancerous tissue and explores the differences associated with uterine cancer localizations and histological subtypes. To achieve this, RNA-Seq data from the TCGA database were preprocessed and analyzed using the KnowSeq package. First, a kNN model was applied to classify uterine cervix cancer, uterine corpus cancer, and healthy uterine samples. Through variable selection, a three-gene signature was identified (VWCE, CLDN15, ADCYAP1R1), achieving consistent 100% test accuracy across 20 repetitions of a 5-fold cross-validation. A similar supplementary analysis using miRNA-Seq data from the same samples identified an optimal two-gene miRNA-coding signature that potentially regulates the three-gene signature mentioned above; its best classification performance was an 82% macro-F1 score. Subsequently, a kNN model was implemented to classify cervical cancer samples into their two main histological subtypes (adenocarcinoma and squamous cell carcinoma). A single-gene signature (ICA1L) was identified, achieving 100% test accuracy across 20 repetitions of a 5-fold cross-validation, and was externally validated using data from the CGCI program. Finally, an examination of six cervical adenosquamous carcinoma (mixed) samples revealed a pattern in which the gene expression value in the mixed class aligned more closely with the histological subtype with lower expression, prompting a reconsideration of the diagnosis for these mixed samples. In summary, this study provides insights into the molecular mechanisms of uterine cervix and corpus cancers. The newly identified gene signatures show robust predictive capabilities and may guide future research in cancer diagnosis and treatment methodologies.
first_indexed 2024-04-24T18:15:04Z
id doaj.art-c3703efa1cb34029934a84d56b778b22
institution Directory Open Access Journal
issn 2073-4425
last_indexed 2024-04-24T18:15:04Z
doi 10.3390/genes15030312
volume 15
issue 3
article_number 312
affiliation Department of Computer Engineering, Automatics and Robotics, C.I.T.I.C., University of Granada, Periodista Rafael Gómez Montero, 2, 18014 Granada, Spain (all five authors)
datestamp 2024-03-27T13:43:02Z
topic uterine corpus cancer
cervical cancer
cervical adenocarcinoma
cervical squamous cell carcinoma
KnowSeq
RNA-Seq
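
The evaluation protocol named in the description (a kNN classifier scored over 20 repetitions of 5-fold cross-validation, with accuracy and macro-F1 as metrics) can be illustrated with a minimal sketch. This is not the authors' code: the article reports using the KnowSeq package, whereas the sketch below is plain Python using only the standard library. The choice of k = 3, the stratified fold assignment, the helper names, and the synthetic three-feature data are all assumptions made for illustration; no real expression values are used.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def stratified_folds(y, n_folds, rng):
    """Split sample indices into n_folds folds, preserving class proportions."""
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % n_folds].append(i)
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def repeated_cv(X, y, k=3, n_folds=5, n_repeats=20, seed=0):
    """Return one (accuracy, macro-F1) pair per repetition of n_folds-fold CV."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        y_true, y_pred = [], []
        for fold in stratified_folds(y, n_folds, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            train_X = [X[i] for i in train]
            train_y = [y[i] for i in train]
            for i in fold:
                y_true.append(y[i])
                y_pred.append(knn_predict(train_X, train_y, X[i], k))
        accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y)
        results.append((accuracy, macro_f1(y_true, y_pred)))
    return results


def make_toy_data(n_per_class=20, seed=1):
    """Synthetic, well-separated 3-class data (hypothetical; NOT real expression values)."""
    rng = random.Random(seed)
    centers = {
        "cervix": (0.0, 0.0, 0.0),
        "corpus": (10.0, 0.0, 0.0),
        "healthy": (0.0, 10.0, 0.0),
    }
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0, 0.5) for c in center])
            y.append(label)
    return X, y


X, y = make_toy_data()
results = repeated_cv(X, y)
```

Each entry of `results` holds the pooled accuracy and macro-F1 for one repetition, so the spread across the 20 entries gives a rough sense of how stable a signature's performance is under different fold assignments. In a real analysis the feature columns would be the expression values of the selected genes, and variable selection would need to happen inside each training fold to avoid leakage.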