Feature screening for survival trait with application to TCGA high-dimensional genomic data

Background: In high-dimensional survival genomic data, identifying cancer-related genes is a challenging and important subject in the field of bioinformatics. In recent years, many feature screening approaches for survival outcomes with high-dimensional survival genomic data have been developed; howe...


Bibliographic Details
Main Authors: Jie-Huei Wang, Cai-Rong Li, Po-Lin Hou
Format: Article
Language: English
Published: PeerJ Inc. 2022-03-01
Series: PeerJ
Subjects: Survival feature screening; High-dimensional genomic data; Network; Survival prediction; TCGA; Esophageal cancer
Online Access: https://peerj.com/articles/13098.pdf
author Jie-Huei Wang
Cai-Rong Li
Po-Lin Hou
collection DOAJ
description Background: Identifying cancer-related genes in high-dimensional survival genomic data is a challenging and important problem in bioinformatics. In recent years, many feature screening approaches for survival outcomes in such data have been developed; however, few studies have systematically compared these methods. The primary purpose of this article is to conduct a series of simulation studies for systematic comparison; the second purpose is to use these feature screening methods to establish a more accurate prediction model for patient survival based on the survival genomic datasets of The Cancer Genome Atlas (TCGA). Results: Simulation studies show that the network-adjusted feature screening measure performs well and outperforms existing popular univariate independent feature screening methods. In the real-data application, we show that the proposed network-adjusted feature screening approach leads to more accurate survival prediction than alternative methods that do not account for gene-gene dependency information. We also use TCGA clinical survival genetic data to identify biomarkers associated with clinical survival outcomes in patients with various cancers, including esophageal, pancreatic, head and neck squamous cell, lung, and breast invasive carcinomas. Conclusions: These applications reveal the advantages of the newly proposed network-adjusted feature selection method over alternative methods that do not consider gene-gene dependency information. Most of the cancer-related genes we identify have previously been reported in the literature, which supports the reliability of the network-based screening method.
first_indexed 2024-03-09T06:53:37Z
format Article
id doaj.art-df3be369b8a441a986c5b1ff13bbdfbc
institution Directory Open Access Journal
issn 2167-8359
language English
last_indexed 2024-03-09T06:53:37Z
publishDate 2022-03-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-df3be369b8a441a986c5b1ff13bbdfbc 2023-12-03T10:16:11Z eng PeerJ Inc. PeerJ 2167-8359 2022-03-01 10 e13098 10.7717/peerj.13098
title Feature screening for survival trait with application to TCGA high-dimensional genomic data
topic Survival feature screening
High-dimensional genomic data
Network
Survival prediction
TCGA
Esophageal cancer
url https://peerj.com/articles/13098.pdf