Widespread redundancy in -omics profiles of cancer mutation states

Abstract Background In studies of cellular function in cancer, researchers are increasingly able to choose from many -omics assays as functional readouts. Choosing the correct readout for a given study can be difficult, and which layer of cellular function is most suitable to capture the relevant si...

Bibliographic Details
Main Authors: Jake Crawford, Brock C. Christensen, Maria Chikina, Casey S. Greene
Format: Article
Language: English
Published: BMC 2022-06-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-022-02705-y
author Jake Crawford
Brock C. Christensen
Maria Chikina
Casey S. Greene
collection DOAJ
description Abstract
Background: In studies of cellular function in cancer, researchers are increasingly able to choose from many -omics assays as functional readouts. Choosing the correct readout for a given study can be difficult, and which layer of cellular function is most suitable to capture the relevant signal remains unclear.
Results: We consider prediction of cancer mutation status (presence or absence) from functional -omics data as a representative problem that presents an opportunity to quantify and compare the ability of different -omics readouts to capture signals of dysregulation in cancer. From the TCGA Pan-Cancer Atlas that contains genetic alteration data, we focus on RNA sequencing, DNA methylation arrays, reverse phase protein arrays (RPPA), microRNA, and somatic mutational signatures as -omics readouts. Across a collection of genes recurrently mutated in cancer, RNA sequencing tends to be the most effective predictor of mutation state. We find that one or more other data types for many of the genes are approximately equally effective predictors. Performance is more variable between mutations than that between data types for the same mutation, and there is little difference between the top data types. We also find that combining data types into a single multi-omics model provides little or no improvement in predictive ability over the best individual data type.
Conclusions: Based on our results, for the design of studies focused on the functional outcomes of cancer mutations, there are often multiple -omics types that can serve as effective readouts, although gene expression seems to be a reasonable default option.
first_indexed 2024-12-12T11:57:47Z
format Article
id doaj.art-b037729a8d594e50b35164bc440f5699
institution Directory Open Access Journal
issn 1474-760X
language English
last_indexed 2024-12-12T11:57:47Z
publishDate 2022-06-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-b037729a8d594e50b35164bc440f5699 | 2022-12-22T00:25:10Z | eng | BMC | Genome Biology | 1474-760X | 2022-06-01 | vol. 23, no. 1, pp. 1-24 | 10.1186/s13059-022-02705-y
Widespread redundancy in -omics profiles of cancer mutation states
Jake Crawford (Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania)
Brock C. Christensen (Department of Epidemiology, Geisel School of Medicine, Dartmouth College)
Maria Chikina (Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh)
Casey S. Greene (Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine)
https://doi.org/10.1186/s13059-022-02705-y
title Widespread redundancy in -omics profiles of cancer mutation states
url https://doi.org/10.1186/s13059-022-02705-y