Hypoxia classifier for transcriptome datasets
Abstract: Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aim...
Main Authors: | Laura Puente-Santamaría, Lucia Sanchez-Gonzalez, Ricardo Ramos-Ruiz, Luis del Peso |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2022-05-01 |
Series: | BMC Bioinformatics |
Subjects: | Transcriptome classification, Hypoxia, Gene expression, RNA-seq, Spatial transcriptomics |
Online Access: | https://doi.org/10.1186/s12859-022-04741-8 |
_version_ | 1817970571291394048 |
---|---|
author | Laura Puente-Santamaría; Lucia Sanchez-Gonzalez; Ricardo Ramos-Ruiz; Luis del Peso |
author_sort | Laura Puente-Santamaría |
collection | DOAJ |
description | Abstract: Molecular gene signatures are useful tools to characterize the physiological state of cell populations, but most have been developed under a narrow range of conditions and cell types and are often restricted to a set of gene identities. Focusing on the transcriptional response to hypoxia, we aimed to generate widely applicable classifiers sourced from the results of a meta-analysis of 69 differential expression datasets, which included 425 individual RNA-seq experiments from 33 different human cell types exposed to different degrees of hypoxia (0.1–5% O₂) for 2–48 h. The resulting decision trees include both gene identities and quantitative boundaries, allowing for easy classification of individual samples without a control or normoxic reference. Each tree is composed of 3–5 genes, mostly drawn from a small set of just 8 genes (EGLN1, MIR210HG, NDRG1, ANKRD37, TCAF2, PFKFB3, BHLHE40, and MAFF). In spite of their simplicity, these classifiers achieve over 95% accuracy in cross-validation and over 80% accuracy when applied to additional challenging datasets. Our results indicate that the classifiers are able to identify hypoxic tumor samples from bulk RNA-seq and hypoxic regions within tumors from spatially resolved transcriptomics datasets. Moreover, application of the classifiers to histological sections from normal tissues suggests the presence of a hypoxic gene expression pattern in the kidney cortex that is not observed in other normoxic organs. Finally, the tree classifiers described herein outperform traditional hypoxic gene signatures when compared against a wide range of datasets. This work describes a set of hypoxic gene signatures, structured as simple decision trees, that identify hypoxic samples and regions with high accuracy and can be applied to a broad variety of gene expression datasets and formats. |
first_indexed | 2024-04-13T20:35:52Z |
format | Article |
id | doaj.art-0ba84e4594084cc6ad00a29c716d54b6 |
institution | Directory of Open Access Journals |
issn | 1471-2105 |
language | English |
last_indexed | 2024-04-13T20:35:52Z |
publishDate | 2022-05-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-0ba84e4594084cc6ad00a29c716d54b6; 2022-12-22T02:31:02Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2022-05-01; vol. 23, no. 1, pp. 1–19; 10.1186/s12859-022-04741-8; Hypoxia classifier for transcriptome datasets; Laura Puente-Santamaría [Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM)]; Lucia Sanchez-Gonzalez [Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM)]; Ricardo Ramos-Ruiz [Genomics Unit Cantoblanco, Fundación Parque Científico de Madrid]; Luis del Peso [Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM)]; https://doi.org/10.1186/s12859-022-04741-8; Transcriptome classification; Hypoxia; Gene expression; RNA-seq; Spatial transcriptomics |
title | Hypoxia classifier for transcriptome datasets |
topic | Transcriptome classification; Hypoxia; Gene expression; RNA-seq; Spatial transcriptomics |
url | https://doi.org/10.1186/s12859-022-04741-8 |
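
The abstract describes classifiers built as small decision trees that pair gene identities with quantitative boundaries, so that a single sample can be labeled without a normoxic reference. The sketch below illustrates that general idea only. The gene names are taken from the abstract, but the tree shape, the thresholds, the expression units, and the names `Leaf`, `Split`, `EXAMPLE_TREE`, and `classify` are all hypothetical and are not the published classifiers.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass(frozen=True)
class Leaf:
    """Terminal node carrying the predicted label."""
    label: str  # "hypoxic" or "normoxic"


@dataclass(frozen=True)
class Split:
    """Internal node: compare one gene's expression to a fixed boundary."""
    gene: str
    threshold: float
    below: "Node"        # branch taken when expression < threshold
    at_or_above: "Node"  # branch taken when expression >= threshold


Node = Union[Leaf, Split]

# HYPOTHETICAL tree: the thresholds and structure are invented for
# illustration and do not come from the paper. Expression values are
# assumed to be on some normalized log scale.
EXAMPLE_TREE: Node = Split(
    gene="EGLN1",
    threshold=5.0,
    below=Split("NDRG1", 6.0, Leaf("normoxic"), Leaf("hypoxic")),
    at_or_above=Split("ANKRD37", 2.0, Leaf("normoxic"), Leaf("hypoxic")),
)


def classify(sample: Mapping[str, float], node: Node = EXAMPLE_TREE) -> str:
    """Walk the tree from the root using only this sample's own values."""
    while isinstance(node, Split):
        value = sample[node.gene]
        node = node.below if value < node.threshold else node.at_or_above
    return node.label
```

Because every decision compares a gene's value to a fixed boundary, `classify` needs only the expression values of the sample itself, which is the property the abstract highlights. Real use would require the actual trees and the normalization described in the article.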