Statistical integration of multi-omics and drug screening data from cell lines.
Data integration methods are used to obtain a unified summary of multiple datasets. For multi-modal data, we propose a computational workflow to jointly analyze datasets from cell lines. The workflow comprises a novel probabilistic data integration method, named POPLS-DA, for multi-omics data. The w...
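Main Authors: | Said El Bouhaddani, Matthias Höllerhage, Hae-Won Uh, Claudia Moebius, Marc Bickle, Günter Höglinger, Jeanine Houwing-Duistermaat |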
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2024-01-01 |
Series: | PLoS Computational Biology |
Online Access: | https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011809&type=printable |
_version_ | 1797296236499304448 |
---|---|
author | Said El Bouhaddani, Matthias Höllerhage, Hae-Won Uh, Claudia Moebius, Marc Bickle, Günter Höglinger, Jeanine Houwing-Duistermaat |
author_sort | Said El Bouhaddani |
collection | DOAJ |
description | Data integration methods are used to obtain a unified summary of multiple datasets. For multi-modal data, we propose a computational workflow to jointly analyze datasets from cell lines. The workflow comprises a novel probabilistic data integration method, named POPLS-DA, for multi-omics data. The workflow is motivated by a study on synucleinopathies where transcriptomics, proteomics, and drug screening data are measured in affected LUHMES cell lines and controls. The aim is to highlight potentially druggable pathways and genes involved in synucleinopathies. First, POPLS-DA is used to prioritize genes and proteins that best distinguish cases and controls. For these genes, an integrated interaction network is constructed where the drug screening data are incorporated to highlight druggable genes and pathways in the network. Finally, functional enrichment analyses are performed to identify clusters of synaptic and lysosome-related genes and proteins targeted by the protective drugs. POPLS-DA is compared to other single- and multi-omics approaches. We found that HSPA5, a member of the heat shock protein 70 family, was among the genes most frequently targeted by the validated drugs, in particular by AT1-blockers. HSPA5 and AT1-blockers have been previously linked to α-synuclein pathology and Parkinson's disease, showing the relevance of our findings. Our computational workflow identified new directions for therapeutic targets for synucleinopathies. POPLS-DA provided a larger interpretable gene set than other single- and multi-omics approaches. An implementation based on R and markdown is freely available online. |
first_indexed | 2024-03-07T22:00:30Z |
format | Article |
id | doaj.art-d1b5e0f2662c41419ef3577fecd9581c |
institution | Directory Open Access Journal |
issn | 1553-734X 1553-7358 |
language | English |
last_indexed | 2024-03-07T22:00:30Z |
publishDate | 2024-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
spelling | doaj.art-d1b5e0f2662c41419ef3577fecd9581c; 2024-02-24T05:31:22Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X, 1553-7358; 2024-01-01; 20; 1; e1011809; 10.1371/journal.pcbi.1011809 |
title | Statistical integration of multi-omics and drug screening data from cell lines. |
url | https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011809&type=printable |