Multiomics Topic Modeling for Breast Cancer Classification

The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on top...

Bibliographic Details
Main Authors: Filippo Valle, Matteo Osella, Michele Caselle
Format: Article
Language: English
Published: MDPI AG 2022-02-01
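Series: Cancers
Subjects: miRNAs; miRNA expression regulation; topic modeling; stochastic block modeling; multiomics; chr14q32
Online Access: https://www.mdpi.com/2072-6694/14/5/1150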
author Filippo Valle
Matteo Osella
Michele Caselle
collection DOAJ
description The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on topic modeling to accomplish this integration task. More specifically, we show how an algorithm based on a hierarchical version of stochastic block modeling can be naturally extended to integrate any combination of ’omics data. We test this approach on breast cancer samples from the TCGA database, integrating data on messenger RNA, microRNAs, and copy number variations. We show that the inclusion of the microRNA layer significantly improves the accuracy of subtype classification. Moreover, some of the hidden structures or “topics” that the algorithm extracts actually correspond to genes and microRNAs involved in breast cancer development and are associated with the survival probability.
first_indexed 2024-03-09T20:46:10Z
format Article
id doaj.art-ea4c1361f2cd4160a7872b24e3f22c5b
institution Directory Open Access Journal
issn 2072-6694
language English
last_indexed 2024-03-09T20:46:10Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Cancers
spelling doaj.art-ea4c1361f2cd4160a7872b24e3f22c5b
2023-11-23T22:46:38Z
eng
MDPI AG
Cancers
2072-6694
2022-02-01
Volume 14, Issue 5, Article 1150
10.3390/cancers14051150
Multiomics Topic Modeling for Breast Cancer Classification
Filippo Valle, Matteo Osella, Michele Caselle (Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy)
https://www.mdpi.com/2072-6694/14/5/1150
title Multiomics Topic Modeling for Breast Cancer Classification
topic miRNAs
miRNA expression regulation
topic modeling
stochastic block modeling
multiomics
chr14q32
url https://www.mdpi.com/2072-6694/14/5/1150