A benchmark study of deep learning-based multi-omics data fusion methods for cancer
Abstract. Background: A fused method using a combination of multi-omics data enables a comprehensive study of complex biological processes and highlights the interrelationship of relevant biomolecules and their functions. Driven by high-throughput sequencing technologies, several promising deep learni...
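Main Authors: | Dongjin Leng, Linyi Zheng, Yuqi Wen, Yunhao Zhang, Lianlian Wu, Jing Wang, Meihong Wang, Zhongnan Zhang, Song He, Xiaochen Bo |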
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-08-01 |
Series: | Genome Biology |
Online Access: | https://doi.org/10.1186/s13059-022-02739-2 |
_version_ | 1811320965525667840 |
---|---|
author | Dongjin Leng Linyi Zheng Yuqi Wen Yunhao Zhang Lianlian Wu Jing Wang Meihong Wang Zhongnan Zhang Song He Xiaochen Bo |
author_facet | Dongjin Leng Linyi Zheng Yuqi Wen Yunhao Zhang Lianlian Wu Jing Wang Meihong Wang Zhongnan Zhang Song He Xiaochen Bo |
author_sort | Dongjin Leng |
collection | DOAJ |
description | Abstract. Background: A fused method using a combination of multi-omics data enables a comprehensive study of complex biological processes and highlights the interrelationship of relevant biomolecules and their functions. Driven by high-throughput sequencing technologies, several promising deep learning methods have been proposed for fusing multi-omics data generated from a large number of samples. Results: In this study, 16 representative deep learning methods are comprehensively evaluated on simulated, single-cell, and cancer multi-omics datasets. For each of the datasets, two tasks are designed: classification and clustering. The classification performance is evaluated by using three benchmarking metrics including accuracy, F1 macro, and F1 weighted. Meanwhile, the clustering performance is evaluated by using four benchmarking metrics including the Jaccard index (JI), C-index, silhouette score, and Davies-Bouldin score. For the cancer multi-omics datasets, the methods’ strength in capturing the association of multi-omics dimensionality reduction results with survival and clinical annotations is further evaluated. The benchmarking results indicate that moGAT achieves the best classification performance. Meanwhile, efmmdVAE, efVAE, and lfmmdVAE show the most promising performance across all complementary contexts in clustering tasks. Conclusions: Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate deep learning-based multi-omics data fusion methods, but also suggest the future directions for the development of more effective multi-omics data fusion methods. The deep learning frameworks are available at https://github.com/zhenglinyi/DL-mo. |
first_indexed | 2024-04-13T13:08:48Z |
format | Article |
id | doaj.art-c4b401cfa1384a05ac0e23f4d9ea52f0 |
institution | Directory Open Access Journal |
issn | 1474-760X |
language | English |
last_indexed | 2024-04-13T13:08:48Z |
publishDate | 2022-08-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj.art-c4b401cfa1384a05ac0e23f4d9ea52f0; 2022-12-22T02:45:40Z; eng; BMC; Genome Biology; 1474-760X; 2022-08-01; 231132; 10.1186/s13059-022-02739-2; A benchmark study of deep learning-based multi-omics data fusion methods for cancer; Dongjin Leng [0]; Linyi Zheng [1]; Yuqi Wen [2]; Yunhao Zhang [3]; Lianlian Wu [4]; Jing Wang [5]; Meihong Wang [6]; Zhongnan Zhang [7]; Song He [8]; Xiaochen Bo [9]; [0] Institute of Health Service and Transfusion Medicine; [1] School of Informatics, Xiamen University; [2] Institute of Health Service and Transfusion Medicine; [3] School of Informatics, Xiamen University; [4] Academy of Medical Engineering and Translational Medicine, Tianjin University; [5] School of Medicine, Tsinghua University; [6] School of Informatics, Xiamen University; [7] School of Informatics, Xiamen University; [8] Institute of Health Service and Transfusion Medicine; [9] Institute of Health Service and Transfusion Medicine; Abstract. Background: A fused method using a combination of multi-omics data enables a comprehensive study of complex biological processes and highlights the interrelationship of relevant biomolecules and their functions. Driven by high-throughput sequencing technologies, several promising deep learning methods have been proposed for fusing multi-omics data generated from a large number of samples. Results: In this study, 16 representative deep learning methods are comprehensively evaluated on simulated, single-cell, and cancer multi-omics datasets. For each of the datasets, two tasks are designed: classification and clustering. The classification performance is evaluated by using three benchmarking metrics including accuracy, F1 macro, and F1 weighted. Meanwhile, the clustering performance is evaluated by using four benchmarking metrics including the Jaccard index (JI), C-index, silhouette score, and Davies-Bouldin score. For the cancer multi-omics datasets, the methods’ strength in capturing the association of multi-omics dimensionality reduction results with survival and clinical annotations is further evaluated. The benchmarking results indicate that moGAT achieves the best classification performance. Meanwhile, efmmdVAE, efVAE, and lfmmdVAE show the most promising performance across all complementary contexts in clustering tasks. Conclusions: Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate deep learning-based multi-omics data fusion methods, but also suggest the future directions for the development of more effective multi-omics data fusion methods. The deep learning frameworks are available at https://github.com/zhenglinyi/DL-mo. ; https://doi.org/10.1186/s13059-022-02739-2 |
spellingShingle | Dongjin Leng Linyi Zheng Yuqi Wen Yunhao Zhang Lianlian Wu Jing Wang Meihong Wang Zhongnan Zhang Song He Xiaochen Bo A benchmark study of deep learning-based multi-omics data fusion methods for cancer Genome Biology |
title | A benchmark study of deep learning-based multi-omics data fusion methods for cancer |
title_full | A benchmark study of deep learning-based multi-omics data fusion methods for cancer |
title_fullStr | A benchmark study of deep learning-based multi-omics data fusion methods for cancer |
title_full_unstemmed | A benchmark study of deep learning-based multi-omics data fusion methods for cancer |
title_short | A benchmark study of deep learning-based multi-omics data fusion methods for cancer |
title_sort | benchmark study of deep learning based multi omics data fusion methods for cancer |
url | https://doi.org/10.1186/s13059-022-02739-2 |
work_keys_str_mv | AT dongjinleng abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT linyizheng abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT yuqiwen abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT yunhaozhang abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT lianlianwu abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT jingwang abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT meihongwang abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT zhongnanzhang abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT songhe abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT xiaochenbo abenchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT dongjinleng benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT linyizheng benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT yuqiwen benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT yunhaozhang benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT lianlianwu benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT jingwang benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT meihongwang benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT zhongnanzhang benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT songhe benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer AT xiaochenbo benchmarkstudyofdeeplearningbasedmultiomicsdatafusionmethodsforcancer |