Multiomics Topic Modeling for Breast Cancer Classification
The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on top...
Main Authors: | Filippo Valle, Matteo Osella, Michele Caselle |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-02-01 |
Series: | Cancers |
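Subjects: | miRNAs; miRNA expression regulation; topic modeling; stochastic block modeling; multiomics; chr14q32 |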
Online Access: | https://www.mdpi.com/2072-6694/14/5/1150 |
_version_ | 1797475581794713600 |
---|---|
author | Filippo Valle; Matteo Osella; Michele Caselle |
author_facet | Filippo Valle; Matteo Osella; Michele Caselle |
author_sort | Filippo Valle |
collection | DOAJ |
description | The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on topic modeling to accomplish this integration task. More specifically, we show how an algorithm based on a hierarchical version of stochastic block modeling can be naturally extended to integrate any combination of ’omics data. We test this approach on breast cancer samples from the TCGA database, integrating data on messenger RNA, microRNAs, and copy number variations. We show that the inclusion of the microRNA layer significantly improves the accuracy of subtype classification. Moreover, some of the hidden structures or “topics” that the algorithm extracts actually correspond to genes and microRNAs involved in breast cancer development and are associated with survival probability. |
first_indexed | 2024-03-09T20:46:10Z |
format | Article |
id | doaj.art-ea4c1361f2cd4160a7872b24e3f22c5b |
institution | Directory of Open Access Journals |
issn | 2072-6694 |
language | English |
last_indexed | 2024-03-09T20:46:10Z |
publishDate | 2022-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Cancers |
spelling | doaj.art-ea4c1361f2cd4160a7872b24e3f22c5b; 2023-11-23T22:46:38Z; eng; MDPI AG; Cancers; 2072-6694; 2022-02-01; 14; 5; 1150; 10.3390/cancers14051150; Multiomics Topic Modeling for Breast Cancer Classification; Filippo Valle 0; Matteo Osella 1; Michele Caselle 2; Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy; Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy; Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy; The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on topic modeling to accomplish this integration task. More specifically, we show how an algorithm based on a hierarchical version of stochastic block modeling can be naturally extended to integrate any combination of ’omics data. We test this approach on breast cancer samples from the TCGA database, integrating data on messenger RNA, microRNAs, and copy number variations. We show that the inclusion of the microRNA layer significantly improves the accuracy of subtype classification. Moreover, some of the hidden structures or “topics” that the algorithm extracts actually correspond to genes and microRNAs involved in breast cancer development and are associated with survival probability.; https://www.mdpi.com/2072-6694/14/5/1150; miRNAs; miRNA expression regulation; topic modeling; stochastic block modeling; multiomics; chr14q32 |
spellingShingle | Filippo Valle; Matteo Osella; Michele Caselle; Multiomics Topic Modeling for Breast Cancer Classification; Cancers; miRNAs; miRNA expression regulation; topic modeling; stochastic block modeling; multiomics; chr14q32 |
title | Multiomics Topic Modeling for Breast Cancer Classification |
title_full | Multiomics Topic Modeling for Breast Cancer Classification |
title_fullStr | Multiomics Topic Modeling for Breast Cancer Classification |
title_full_unstemmed | Multiomics Topic Modeling for Breast Cancer Classification |
title_short | Multiomics Topic Modeling for Breast Cancer Classification |
title_sort | multiomics topic modeling for breast cancer classification |
topic | miRNAs; miRNA expression regulation; topic modeling; stochastic block modeling; multiomics; chr14q32 |
url | https://www.mdpi.com/2072-6694/14/5/1150 |
work_keys_str_mv | AT filippovalle multiomicstopicmodelingforbreastcancerclassification AT matteoosella multiomicstopicmodelingforbreastcancerclassification AT michelecaselle multiomicstopicmodelingforbreastcancerclassification |