Hypoxia classifier for transcriptome datasets

Bibliographic Details
Main Authors: Laura Puente-Santamaría, Lucia Sanchez-Gonzalez, Ricardo Ramos-Ruiz, Luis del Peso
Format: Article
Language: English
Published: BMC 2022-05-01
Series: BMC Bioinformatics
Subjects: Transcriptome classification; Hypoxia; Gene expression; RNA-seq; Spatial transcriptomics
Online Access: https://doi.org/10.1186/s12859-022-04741-8
author Laura Puente-Santamaría
Lucia Sanchez-Gonzalez
Ricardo Ramos-Ruiz
Luis del Peso
collection DOAJ
description Abstract Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aimed to generate widely applicable classifiers sourced from the results of a meta-analysis of 69 differential expression datasets, which included 425 individual RNA-seq experiments from 33 different human cell types exposed to different degrees of hypoxia (0.1–5% $\mathrm{O}_2$) for 2–48 h. The resulting decision trees include both gene identities and quantitative boundaries, allowing for easy classification of individual samples without a control or normoxic reference. Each tree is composed of 3–5 genes, mostly drawn from a small set of just 8 genes (EGLN1, MIR210HG, NDRG1, ANKRD37, TCAF2, PFKFB3, BHLHE40, and MAFF). In spite of their simplicity, these classifiers achieve over 95% accuracy in cross-validation and over 80% accuracy when applied to additional challenging datasets. Our results indicate that the classifiers are able to identify hypoxic tumor samples from bulk RNA-seq and hypoxic regions within tumors from spatially resolved transcriptomics datasets. Moreover, application of the classifiers to histological sections from normal tissues suggests the presence of a hypoxic gene expression pattern in the kidney cortex not observed in other normoxic organs. Finally, the tree classifiers described herein outperform traditional hypoxic gene signatures when compared against a wide range of datasets. This work describes a set of hypoxic gene signatures, structured as simple decision trees, that identify hypoxic samples and regions with high accuracy and can be applied to a broad variety of gene expression datasets and formats.
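The abstract describes classifiers that combine gene identities with quantitative boundaries, so that a single sample can be labeled without a normoxic reference. The sketch below illustrates that general idea only. The gene names come from the abstract, but the tree shape, the thresholds, the expression units, and the names `Leaf`, `Split`, `classify`, and `EXAMPLE_TREE` are all hypothetical and are not taken from the published classifiers.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass(frozen=True)
class Leaf:
    """Terminal node carrying the predicted label."""
    label: str  # "hypoxic" or "normoxic"


@dataclass(frozen=True)
class Split:
    """Internal node: compare one gene's expression against a boundary."""
    gene: str
    threshold: float  # HYPOTHETICAL boundary, in arbitrary normalized units
    below: "Node"     # branch taken when expression <= threshold
    above: "Node"     # branch taken when expression > threshold


Node = Union[Leaf, Split]


def classify(sample: Mapping[str, float], node: Node) -> str:
    """Walk the tree for a single sample; no control sample is needed."""
    while isinstance(node, Split):
        # Assumption: a gene missing from the sample is treated as not expressed.
        value = sample.get(node.gene, 0.0)
        node = node.above if value > node.threshold else node.below
    return node.label


# Illustrative three-gene tree. Structure and numbers are invented for the example.
EXAMPLE_TREE = Split(
    gene="EGLN1",
    threshold=5.0,
    below=Split("NDRG1", 6.0, below=Leaf("normoxic"), above=Leaf("hypoxic")),
    above=Split("ANKRD37", 2.0, below=Leaf("normoxic"), above=Leaf("hypoxic")),
)

if __name__ == "__main__":
    sample = {"EGLN1": 7.0, "ANKRD37": 3.5, "NDRG1": 4.0}
    print(classify(sample, EXAMPLE_TREE))
```

Because each decision depends only on the sample's own expression values and fixed boundaries, the same function can in principle be applied per bulk sample or per spot in a spatial dataset, provided the values are on the scale the boundaries were fitted for.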
first_indexed 2024-04-13T20:35:52Z
format Article
id doaj.art-0ba84e4594084cc6ad00a29c716d54b6
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-13T20:35:52Z
publishDate 2022-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-0ba84e4594084cc6ad00a29c716d54b6
2022-12-22T02:31:02Z
eng
BMC
BMC Bioinformatics
1471-2105
2022-05-01
231119
10.1186/s12859-022-04741-8
Hypoxia classifier for transcriptome datasets
Laura Puente-Santamaría (Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM))
Lucia Sanchez-Gonzalez (Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM))
Ricardo Ramos-Ruiz (Genomics Unit Cantoblanco, Fundación Parque Científico de Madrid)
Luis del Peso (Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM))
https://doi.org/10.1186/s12859-022-04741-8
Transcriptome classification
Hypoxia
Gene expression
RNA-seq
Spatial transcriptomics
title Hypoxia classifier for transcriptome datasets
topic Transcriptome classification
Hypoxia
Gene expression
RNA-seq
Spatial transcriptomics
url https://doi.org/10.1186/s12859-022-04741-8