Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data

Abstract. Background: Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of the resulting multi-omics data, such as somatic mutation, copy number alteration (CNA), DNA methylation, miRNA, gene express...


Bibliographic Details
Main Authors: Yasser EL-Manzalawy, Tsung-Yu Hsieh, Manu Shivakumar, Dokyoon Kim, Vasant Honavar
Format: Article
Language: English
Published: BMC 2018-09-01
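Series: BMC Medical Genomics
Subjects: Multi-omics data integration; Multi-view feature selection; Cancer survival prediction; Machine learning
Online Access: http://link.springer.com/article/10.1186/s12920-018-0388-0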
author Yasser EL-Manzalawy
Tsung-Yu Hsieh
Manu Shivakumar
Dokyoon Kim
Vasant Honavar
collection DOAJ
description Abstract. Background: Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of the resulting multi-omics data, such as somatic mutation, copy number alteration (CNA), DNA methylation, miRNA, gene expression, and protein expression, offer tantalizing possibilities for realizing the promise and potential of precision medicine in cancer prevention, diagnosis, and treatment by substantially improving our understanding of underlying mechanisms as well as the discovery of novel biomarkers for different types of cancers. However, such analyses present a number of challenges, including the heterogeneity and high dimensionality of omics data. Methods: We propose a novel framework for multi-omics data integration using multi-view feature selection. We introduce a novel multi-view feature selection algorithm, MRMR-mv, an adaptation of the well-known Min-Redundancy and Maximum-Relevance (MRMR) single-view feature selection algorithm to the multi-view setting. Results: We report results of experiments using an ovarian cancer multi-omics dataset derived from the TCGA database on the task of predicting ovarian cancer survival. Our results suggest that multi-view models outperform both view-specific models (i.e., models trained and tested using a single type of omics data) and models based on two baseline data fusion methods. Conclusions: Our results demonstrate the potential of multi-view feature selection in integrative analyses and predictive modeling from multi-omics data.
first_indexed 2024-12-17T22:27:09Z
format Article
id doaj.art-28bc795645b940629925263fac0cda00
institution Directory Open Access Journal
issn 1755-8794
language English
last_indexed 2024-12-17T22:27:09Z
publishDate 2018-09-01
publisher BMC
record_format Article
series BMC Medical Genomics
spelling doaj.art-28bc795645b940629925263fac0cda00 2022-12-21T21:30:18Z eng BMC BMC Medical Genomics 1755-8794 2018-09-01 11 S3 19 31 10.1186/s12920-018-0388-0
Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data
Yasser EL-Manzalawy (0)
Tsung-Yu Hsieh (1)
Manu Shivakumar (2)
Dokyoon Kim (3)
Vasant Honavar (4)
(0) Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University
(1) Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University
(2) Biomedical and Translational Informatics Institute, Geisinger Health System
(3) Biomedical and Translational Informatics Institute, Geisinger Health System
(4) Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University
http://link.springer.com/article/10.1186/s12920-018-0388-0
Multi-omics data integration
Multi-view feature selection
Cancer survival prediction
Machine learning
title Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data
topic Multi-omics data integration
Multi-view feature selection
Cancer survival prediction
Machine learning
url http://link.springer.com/article/10.1186/s12920-018-0388-0