Correlation-based joint feature screening for semi-competing risks outcomes with application to breast cancer data

Ultrahigh-dimensional gene features are often collected in modern cancer studies, in which the number of gene features p is far larger than the sample size n. While gene expression patterns have been shown to be related to patients' survival in microarray-based gene expression studies, one has...

Bibliographic Details
Main Authors: Peng, Mengjiao; Xiang, Liming
Other Authors: School of Physical and Mathematical Sciences
Format: Journal Article
Language: English
Published: 2022
Subjects: Science::Mathematics; Gene Expression Data; Joint Survival Function
Online Access: https://hdl.handle.net/10356/157027
_version_ 1811680661184970752
author Peng, Mengjiao
Xiang, Liming
author2 School of Physical and Mathematical Sciences
collection NTU
description Ultrahigh-dimensional gene features are often collected in modern cancer studies, in which the number of gene features p is far larger than the sample size n. While gene expression patterns have been shown to be related to patients' survival in microarray-based gene expression studies, one has to deal with the challenges that ultrahigh-dimensional genetic predictors pose for survival prediction and for genetic understanding of the disease in precision medicine. The problem becomes more complicated when two types of survival endpoints, distant metastasis-free survival and overall survival, are of interest and the outcome data are subject to semi-competing risks, because distant metastasis-free survival may be censored by overall survival but not vice versa. Our focus in this paper is to extract important features that have strong impacts on both distant metastasis-free survival and overall survival jointly from massive gene expression data in the semi-competing risks setting. We propose a model-free screening method based on ranking the correlation between gene features and the joint survival function of the two endpoints. The method accounts for the relationship between the two endpoints through a simply defined utility measure that is easy to understand and calculate. We establish its favorable theoretical properties, such as sure screening and ranking consistency, and evaluate its finite-sample performance through extensive simulation studies. Finally, an application to classifying breast cancer data demonstrates the utility of the proposed method in practice.
first_indexed 2024-10-01T03:28:36Z
format Journal Article
id ntu-10356/157027
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:28:36Z
publishDate 2022
record_format dspace
spelling ntu-10356/157027 2023-02-28T20:06:29Z
Ministry of Education (MOE) Submitted/Accepted version Xiang’s research was supported by the Singapore Ministry of Education Academic Research Fund Tier 1 grant RG98/20 and Peng’s research was supported by National Natural Science Foundation of China (NSFC Grant No. 92046005). 2022-04-30T07:43:21Z 2022-04-30T07:43:21Z 2021 Journal Article Peng, M. & Xiang, L. (2021). Correlation-based joint feature screening for semi-competing risks outcomes with application to breast cancer data. Statistical Methods in Medical Research, 30(11), 2428-2446. https://dx.doi.org/10.1177/09622802211037071 0962-2802 https://hdl.handle.net/10356/157027 10.1177/09622802211037071 34519231 2-s2.0-85114854933 11 30 2428 2446 en RG98/20 Statistical Methods in Medical Research © 2021 The Author(s). All rights reserved. This paper was published in Statistical Methods in Medical Research and is made available with permission of The Author(s). application/pdf
title Correlation-based joint feature screening for semi-competing risks outcomes with application to breast cancer data
topic Science::Mathematics
Gene Expression Data
Joint Survival Function
url https://hdl.handle.net/10356/157027