A benchmark study of deep learning-based multi-omics data fusion methods for cancer

Abstract Background A fused method using a combination of multi-omics data enables a comprehensive study of complex biological processes and highlights the interrelationship of relevant biomolecules and their functions. Driven by high-throughput sequencing technologies, several promising deep learni...

Bibliographic Details
Main Authors: Dongjin Leng, Linyi Zheng, Yuqi Wen, Yunhao Zhang, Lianlian Wu, Jing Wang, Meihong Wang, Zhongnan Zhang, Song He, Xiaochen Bo
Format: Article
Language: English
Published: BMC 2022-08-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-022-02739-2
author Dongjin Leng
Linyi Zheng
Yuqi Wen
Yunhao Zhang
Lianlian Wu
Jing Wang
Meihong Wang
Zhongnan Zhang
Song He
Xiaochen Bo
collection DOAJ
description Abstract. Background: A fused method using a combination of multi-omics data enables a comprehensive study of complex biological processes and highlights the interrelationship of relevant biomolecules and their functions. Driven by high-throughput sequencing technologies, several promising deep learning methods have been proposed for fusing multi-omics data generated from a large number of samples. Results: In this study, 16 representative deep learning methods are comprehensively evaluated on simulated, single-cell, and cancer multi-omics datasets. For each of the datasets, two tasks are designed: classification and clustering. The classification performance is evaluated by using three benchmarking metrics including accuracy, F1 macro, and F1 weighted. Meanwhile, the clustering performance is evaluated by using four benchmarking metrics including the Jaccard index (JI), C-index, silhouette score, and Davies-Bouldin score. For the cancer multi-omics datasets, the methods’ strength in capturing the association of multi-omics dimensionality reduction results with survival and clinical annotations is further evaluated. The benchmarking results indicate that moGAT achieves the best classification performance. Meanwhile, efmmdVAE, efVAE, and lfmmdVAE show the most promising performance across all complementary contexts in clustering tasks. Conclusions: Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate deep learning-based multi-omics data fusion methods, but also suggest the future directions for the development of more effective multi-omics data fusion methods. The deep learning frameworks are available at https://github.com/zhenglinyi/DL-mo.
first_indexed 2024-04-13T13:08:48Z
format Article
id doaj.art-c4b401cfa1384a05ac0e23f4d9ea52f0
institution Directory Open Access Journal
issn 1474-760X
language English
last_indexed 2024-04-13T13:08:48Z
publishDate 2022-08-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-c4b401cfa1384a05ac0e23f4d9ea52f0 2022-12-22T02:45:40Z eng BMC Genome Biology 1474-760X 2022-08-01 231132 10.1186/s13059-022-02739-2
A benchmark study of deep learning-based multi-omics data fusion methods for cancer
Dongjin Leng (Institute of Health Service and Transfusion Medicine)
Linyi Zheng (School of Informatics, Xiamen University)
Yuqi Wen (Institute of Health Service and Transfusion Medicine)
Yunhao Zhang (School of Informatics, Xiamen University)
Lianlian Wu (Academy of Medical Engineering and Translational Medicine, Tianjin University)
Jing Wang (School of Medicine, Tsinghua University)
Meihong Wang (School of Informatics, Xiamen University)
Zhongnan Zhang (School of Informatics, Xiamen University)
Song He (Institute of Health Service and Transfusion Medicine)
Xiaochen Bo (Institute of Health Service and Transfusion Medicine)
https://doi.org/10.1186/s13059-022-02739-2
title A benchmark study of deep learning-based multi-omics data fusion methods for cancer
url https://doi.org/10.1186/s13059-022-02739-2