An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers
Pulmonary fibrosing diseases are at the epicenter of biomedical research, both because of their increasing prevalence and because of their association with SARS-CoV-2 infections. Research on idiopathic pulmonary fibrosis, the most lethal of the interstitial lung diseases, is in need of new biomarkers and p...
Main Authors: | Dionysios Fanidis, Vasileios C. Pezoulas, Dimitrios I. Fotiadis, Vassilis Aidinis |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2023-01-01 |
Series: | Computational and Structural Biotechnology Journal |
Subjects: | Idiopathic pulmonary fibrosis (IPF); Machine learning; Diagnostic biomarkers; Omics data |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2001037023001423 |
_version_ | 1797384070018105344 |
---|---|
author | Dionysios Fanidis Vasileios C. Pezoulas Dimitrios I. Fotiadis Vassilis Aidinis |
collection | DOAJ |
description | Pulmonary fibrosing diseases are at the epicenter of biomedical research, both because of their increasing prevalence and because of their association with SARS-CoV-2 infections. Research on idiopathic pulmonary fibrosis, the most lethal of the interstitial lung diseases, is in need of new biomarkers and potential disease targets, a goal that could be accelerated using machine learning techniques. In this study, we used Shapley values to explain the decisions made by an ensemble learning model trained to classify samples as either pulmonary fibrosis or steady state based on the expression values of deregulated genes. This process yielded a full and a laconic set of features that separate the phenotypes at least as well as previously published marker sets. For example, specificity improved by up to 6% and the Matthews correlation coefficient by up to 5%. Evaluation with an additional independent dataset showed that our feature set generalizes better than the others. Ultimately, the proposed gene lists are expected to serve not only as new sets of diagnostic marker elements, but also as a target pool for future research initiatives. |
first_indexed | 2024-03-08T21:31:16Z |
format | Article |
id | doaj.art-74225bb4bd5946ac8c19c4c7fdb3d10a |
institution | Directory Open Access Journal |
issn | 2001-0370 |
language | English |
last_indexed | 2024-03-08T21:31:16Z |
publishDate | 2023-01-01 |
publisher | Elsevier |
record_format | Article |
series | Computational and Structural Biotechnology Journal |
spelling | Computational and Structural Biotechnology Journal, vol. 21, pp. 2305-2315 (2023). Author affiliations: Dionysios Fanidis: Institute for Fundamental Biomedical Research, BSRC Alexander Fleming, Vari GR16672, Greece. Vasileios C. Pezoulas: Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece. Dimitrios I. Fotiadis: Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece; Biomedical Research Institute, FORTH, Ioannina GR45110, Greece. Vassilis Aidinis (corresponding author): Institute for Fundamental Biomedical Research, BSRC Alexander Fleming, Vari GR16672, Greece. |
title | An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
topic | Idiopathic pulmonary fibrosis (IPF) Machine learning Diagnostic biomarkers Omics data |
url | http://www.sciencedirect.com/science/article/pii/S2001037023001423 |
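
The description above reports gains in specificity and in the Matthews correlation coefficient (MCC). As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from binary predictions. The function names, the label encoding (1 = pulmonary fibrosis, 0 = steady state), and the toy labels are assumptions made for illustration only; they are not taken from the study and do not reproduce its results.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, tn, fp, fn) for a binary task; `positive` is the disease label."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def specificity(tn, fp):
    """True-negative rate: TN / (TN + FP). Returns 0.0 if there are no negatives."""
    total_negatives = tn + fp
    return tn / total_negatives if total_negatives else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Toy labels, invented for illustration (1 = fibrosis, 0 = steady state).
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print("specificity:", specificity(tn, fp))
    print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside specificity when the two classes are of unequal size.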