Synergistic Effects of Different Levels of Genomic Data for the Staging of Lung Adenocarcinoma: An Illustrative Study

Lung adenocarcinoma (LUAD) is a common and very lethal cancer. Accurate staging is a prerequisite for its effective diagnosis and treatment. Therefore, improving the accuracy of the stage prediction of LUAD patients is of great clinical relevance. Previous works have mainly focused on single genomic...


Bibliographic Details
Main Authors: Yingxia Li, Ulrich Mansmann, Shangming Du, Roman Hornung
Format: Article
Language: English
Published: MDPI AG 2021-11-01
Series: Genes
Subjects: multi-omics data, lung adenocarcinoma, MKL, mRMR
Online Access: https://www.mdpi.com/2073-4425/12/12/1872
author Yingxia Li
Ulrich Mansmann
Shangming Du
Roman Hornung
author_sort Yingxia Li
collection DOAJ
description Lung adenocarcinoma (LUAD) is a common and very lethal cancer. Accurate staging is a prerequisite for its effective diagnosis and treatment. Therefore, improving the accuracy of the stage prediction of LUAD patients is of great clinical relevance. Previous works have mainly focused on a single type of genomic data, or on a small number of different omics data types concurrently, for generating predictive models. A few of them have considered multi-omics data from genome to proteome. We used a publicly available dataset to illustrate the potential of multi-omics data for stage prediction in LUAD. In particular, we investigated the roles of the specific omics data types in the prediction process. We used a self-developed method, Omics-MKL, for stage prediction. It combines an existing feature ranking technique, Minimum Redundancy and Maximum Relevance (mRMR), which avoids redundancy among the selected features, with multiple kernel learning (MKL), which applies different kernels to different omics data types. Each of the considered omics data types individually provided useful prediction results. Moreover, using multi-omics data delivered notably better results than using single-omics data. Gene expression and methylation information seem to play vital roles in the staging of LUAD. The Omics-MKL method retained 70 features after the selection process. Of these, 21 (30%) were methylation features and 34 (48.57%) were gene expression features. Moreover, 18 (25.71%) of the selected features are known to be related to LUAD, and 29 (41.43%) to lung cancer in general. Using multi-omics data from genome to proteome for predicting the stage of LUAD seems promising because each omics data type may improve the accuracy of the predictions. Here, methylation and gene expression data may play particularly important roles.
first_indexed 2024-03-10T04:02:43Z
format Article
id doaj.art-ba66b56547674b6fab8b6e37dafb4f59
institution Directory of Open Access Journals
issn 2073-4425
language English
last_indexed 2024-03-10T04:02:43Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-ba66b56547674b6fab8b6e37dafb4f59 2023-11-23T08:29:41Z eng MDPI AG Genes 2073-4425 2021-11-01 12 12 1872 10.3390/genes12121872
Synergistic Effects of Different Levels of Genomic Data for the Staging of Lung Adenocarcinoma: An Illustrative Study
Yingxia Li (0), Ulrich Mansmann (1), Shangming Du (2), Roman Hornung (3)
(0)-(3) Institute of Medical Informatics, Biometry and Epidemiology, University of Munich, 81377 Munich, Germany
https://www.mdpi.com/2073-4425/12/12/1872
multi-omics data
lung adenocarcinoma
MKL
mRMR
title Synergistic Effects of Different Levels of Genomic Data for the Staging of Lung Adenocarcinoma: An Illustrative Study
topic multi-omics data
lung adenocarcinoma
MKL
mRMR
url https://www.mdpi.com/2073-4425/12/12/1872