A multimodal graph neural network framework for cancer molecular subtype classification
Abstract Background The recent development of high-throughput sequencing has created a large collection of multi-omics data, which enables researchers to better investigate cancer molecular profiles and cancer taxonomy based on molecular subtypes. Integrating multi-omics data has been proven to be e...
Main Authors: | Bingjun Li, Sheida Nabavi |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-01-01 |
Series: | BMC Bioinformatics |
Subjects: | |
Online Access: | https://doi.org/10.1186/s12859-023-05622-4 |
_version_ | 1797349663133663232 |
---|---|
author | Bingjun Li Sheida Nabavi |
author_facet | Bingjun Li Sheida Nabavi |
author_sort | Bingjun Li |
collection | DOAJ |
description | Abstract. Background: The recent development of high-throughput sequencing has created a large collection of multi-omics data, which enables researchers to better investigate cancer molecular profiles and cancer taxonomy based on molecular subtypes. Integrating multi-omics data has been proven to be effective for building more precise classification models. Most current multi-omics integrative models are based on deep neural networks and use either early fusion in the form of concatenation or late fusion with a separate feature extractor for each omic. Due to the nature of biological systems, graphs are a better structural representation of bio-medical data. Although a few graph neural network (GNN) based multi-omics integrative methods have been proposed, they suffer from three common disadvantages. First, most of them use only one type of connection, either inter-omics or intra-omic; second, they consider only one kind of GNN layer, either graph convolution network (GCN) or graph attention network (GAT); and third, most of these methods have not been tested on a more complex classification task, such as cancer molecular subtypes. Results: In this study, we propose a novel end-to-end multi-omics GNN framework for accurate and robust cancer subtype classification. The proposed model utilizes multi-omics data in the form of heterogeneous multi-layer graphs, which combine both inter-omics and intra-omic connections from established biological knowledge. The proposed model incorporates learned graph features and global genome features for accurate classification. We tested the proposed model on The Cancer Genome Atlas (TCGA) Pan-cancer dataset and the TCGA breast invasive carcinoma (BRCA) dataset for molecular subtype and cancer subtype classification, respectively. The proposed model shows superior performance compared to four current state-of-the-art baseline models in terms of accuracy, F1 score, precision, and recall. The comparative analysis of GAT-based models and GCN-based models reveals that GAT-based models are preferred for smaller graphs with less information and GCN-based models are preferred for larger graphs with extra information. |
first_indexed | 2024-03-08T12:33:38Z |
format | Article |
id | doaj.art-98d14d666f8243a6bfde9a2f2e7fbf48 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-03-08T12:33:38Z |
publishDate | 2024-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-98d14d666f8243a6bfde9a2f2e7fbf48; 2024-01-21T12:37:35Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2024-01-01; 25; 1; 1; 19; 10.1186/s12859-023-05622-4; A multimodal graph neural network framework for cancer molecular subtype classification; Bingjun Li (Department of Computer Science and Engineering, University of Connecticut); Sheida Nabavi (Department of Computer Science and Engineering, University of Connecticut); https://doi.org/10.1186/s12859-023-05622-4; Graph attention network; Multi-omics integration; Cancer subtype; Molecular subtype |
spellingShingle | Bingjun Li Sheida Nabavi A multimodal graph neural network framework for cancer molecular subtype classification BMC Bioinformatics Graph attention network Multi-omics integration Cancer subtype Molecular subtype |
title | A multimodal graph neural network framework for cancer molecular subtype classification |
title_full | A multimodal graph neural network framework for cancer molecular subtype classification |
title_fullStr | A multimodal graph neural network framework for cancer molecular subtype classification |
title_full_unstemmed | A multimodal graph neural network framework for cancer molecular subtype classification |
title_short | A multimodal graph neural network framework for cancer molecular subtype classification |
title_sort | multimodal graph neural network framework for cancer molecular subtype classification |
topic | Graph attention network Multi-omics integration Cancer subtype Molecular subtype |
url | https://doi.org/10.1186/s12859-023-05622-4 |
work_keys_str_mv | AT bingjunli amultimodalgraphneuralnetworkframeworkforcancermolecularsubtypeclassification AT sheidanabavi amultimodalgraphneuralnetworkframeworkforcancermolecularsubtypeclassification AT bingjunli multimodalgraphneuralnetworkframeworkforcancermolecularsubtypeclassification AT sheidanabavi multimodalgraphneuralnetworkframeworkforcancermolecularsubtypeclassification |
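
The abstract in this record describes a multi-layer graph that combines intra-omic and inter-omics connections and is processed by GCN or GAT layers. The following is a minimal NumPy sketch of that idea under stated assumptions, not the authors' implementation: the four-node graph, the edge lists, the feature sizes, and the function names (`build_adjacency`, `normalized_adjacency`, `gcn_layer`) are all hypothetical, and the layer shown is the standard symmetric-normalized GCN propagation rule rather than anything confirmed by the record.

```python
import numpy as np


def build_adjacency(num_nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    adj = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    return adj


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the normalization of a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees including self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, features, weight):
    """One GCN propagation step followed by a ReLU non-linearity."""
    return np.maximum(normalized_adjacency(adj) @ features @ weight, 0.0)


# Hypothetical toy graph: nodes 0-1 belong to one omic, nodes 2-3 to another.
intra_edges = [(0, 1), (2, 3)]   # assumed intra-omic connections
inter_edges = [(2, 0), (3, 1)]   # assumed inter-omics connections

# Both connection types share one adjacency matrix over all nodes.
adj = build_adjacency(4, intra_edges + inter_edges)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # 3 made-up input features per node
weight = rng.normal(size=(3, 2))     # untrained projection to 2 hidden units

hidden = gcn_layer(adj, features, weight)
graph_embedding = hidden.mean(axis=0)    # simple mean readout over nodes
```

A GAT layer would replace the fixed normalized adjacency with learned, per-edge attention weights; the sketch keeps the GCN form because it needs no trainable attention parameters to illustrate the message passing.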