An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers

Bibliographic Details
Main Authors: Dionysios Fanidis, Vasileios C. Pezoulas, Dimitrios I. Fotiadis, Vassilis Aidinis
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Idiopathic pulmonary fibrosis (IPF); Machine learning; Diagnostic biomarkers; Omics data
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037023001423
collection DOAJ
description Pulmonary fibrosing diseases are at the epicenter of biomedical research, owing both to their increasing prevalence and to their association with SARS-CoV-2 infections. Research on idiopathic pulmonary fibrosis, the most lethal of the interstitial lung diseases, is in need of new biomarkers and potential disease targets, a goal that machine learning techniques could help to reach sooner. In this study, we used Shapley values to explain the decisions of an ensemble learning model trained to classify samples as either pulmonary fibrosis or steady state based on the expression values of deregulated genes. This process yielded a full and a laconic set of features that separate the phenotypes at least as well as previously published marker sets. Indicatively, a maximum increase of 6% in specificity and 5% in Matthews correlation coefficient was achieved. Evaluation on an additional independent dataset showed that our feature set generalizes better than the others. Ultimately, the proposed gene lists are expected to serve not only as new sets of diagnostic markers, but also as a target pool for future research initiatives.
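As a rough illustration of the Shapley-value idea mentioned in the description, the following Python sketch computes exact Shapley values for a toy scoring function and ranks the features by them. The gene names, the scores, and the function names are all hypothetical and are not taken from the study; this record does not specify the authors' ensemble model or their explanation tooling, so this should be read as a sketch of the general technique only.

```python
from itertools import permutations

# Hypothetical gene names and scores, chosen only for illustration;
# they are not taken from the study.
GENES = ("GENE_A", "GENE_B", "GENE_C")


def toy_model_score(present):
    """Toy stand-in for a classifier output given the set of genes it can see."""
    score = 0.5                      # baseline score with no features
    if "GENE_A" in present:
        score += 0.30
    if "GENE_B" in present:
        score += 0.10
    if "GENE_A" in present and "GENE_B" in present:
        score += 0.05                # interaction shared between A and B
    return score                     # GENE_C never changes the score


def exact_shapley(value_fn, features):
    """Average each feature's marginal contribution over all feature orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        seen = set()
        previous = value_fn(seen)
        for feature in order:
            seen.add(feature)
            current = value_fn(seen)
            totals[feature] += current - previous
            previous = current
    return {f: total / len(orderings) for f, total in totals.items()}


phi = exact_shapley(toy_model_score, GENES)
# Rank features from most to least influential on the toy score.
ranked = sorted(phi, key=phi.get, reverse=True)
print(phi)
print(ranked)
```

The attributions sum to the difference between the score with all features and the baseline score, which is the property that makes Shapley values usable for ranking candidate markers. Exact enumeration grows factorially with the number of features, so it is only practical for toy cases like this one.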
first_indexed 2024-03-08T21:31:16Z
format Article
id doaj.art-74225bb4bd5946ac8c19c4c7fdb3d10a
institution Directory Open Access Journal
issn 2001-0370
language English
last_indexed 2024-03-08T21:31:16Z
publishDate 2023-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj.art-74225bb4bd5946ac8c19c4c7fdb3d10a
2023-12-21T07:31:20Z
eng
Elsevier
Computational and Structural Biotechnology Journal
2001-0370
2023-01-01
Volume 21, pages 2305-2315
An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers
Dionysios Fanidis (0), Vasileios C. Pezoulas (1), Dimitrios I. Fotiadis (2), Vassilis Aidinis (3, corresponding author)
(0) Institute for Fundamental Biomedical Research, BSRC Alexander Fleming, Vari GR16672, Greece
(1) Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece
(2) Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece; Biomedical Research Institute, FORTH, Ioannina GR45110, Greece
(3) Institute for Fundamental Biomedical Research, BSRC Alexander Fleming, Vari GR16672, Greece
http://www.sciencedirect.com/science/article/pii/S2001037023001423
Idiopathic pulmonary fibrosis (IPF); Machine learning; Diagnostic biomarkers; Omics data
topic Idiopathic pulmonary fibrosis (IPF)
Machine learning
Diagnostic biomarkers
Omics data