GediNET for discovering gene associations across diseases using knowledge based machine learning approach
Abstract The most common approaches to discovering genes associated with specific diseases are based on machine learning and use a variety of feature selection techniques to identify significant genes that can serve as biomarkers for a given disease. More recently, the integration in this process of...
Main Authors: | Emma Qumsiyeh, Louise Showe, Malik Yousef |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2022-11-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-022-24421-0 |
author | Emma Qumsiyeh Louise Showe Malik Yousef |
collection | DOAJ |
description | Abstract The most common approaches to discovering genes associated with specific diseases are based on machine learning and use a variety of feature selection techniques to identify significant genes that can serve as biomarkers for a given disease. More recently, the integration in this process of prior knowledge-based approaches has shown significant promise in the discovery of new biomarkers with potential translational applications. In this study, we developed a novel approach, GediNET, that integrates prior biological knowledge to gene Groups that are shown to be associated with a specific disease such as a cancer. The novelty of GediNET is that it then also allows the discovery of significant associations between that specific disease and other diseases. The initial step in this process involves the identification of gene Groups. The Groups are then subjected to a Scoring component to identify the top performing classification Groups. The top-ranked gene Groups are then used to train a Machine Learning Model. The process of Grouping, Scoring and Modelling (G-S-M) is used by GediNET to identify other diseases that are similarly associated with this signature. GediNET identifies these relationships through Disease–Disease Association (DDA) based machine learning. DDA explores novel associations between diseases and identifies relationships which could be used to further improve approaches to diagnosis, prognosis, and treatment. The GediNET KNIME workflow can be downloaded from: https://github.com/malikyousef/GediNET.git or https://kni.me/w/3kH1SQV_mMUsMTS . |
first_indexed | 2024-04-13T12:45:39Z |
id | doaj.art-7e0d99f912504d01965405b91206c9e5 |
institution | Directory Open Access Journal |
issn | 2045-2322 |
last_indexed | 2024-04-13T12:45:39Z |
author_affiliations | Emma Qumsiyeh: Information Technology Engineering, Al-Quds University; Louise Showe: The Wistar Institute; Malik Yousef: Department of Information Systems, Zefat Academic College |
citation | Scientific Reports, vol. 12, no. 1, pp. 1-17 (2022) |
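
The Grouping, Scoring and Modelling (G-S-M) steps described in the abstract can be illustrated with a minimal sketch. This is not the GediNET KNIME workflow: the toy expression values, the group names (`disease_A`, `disease_B`, `disease_C`), the function names, and the choice of a nearest-centroid classifier with leave-one-out accuracy as the score are all assumptions made for illustration only.

```python
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in scorer)."""
    correct = 0
    for i in range(len(X)):
        train = [(x, lbl) for j, (x, lbl) in enumerate(zip(X, y)) if j != i]
        cents = {
            lbl: centroid([x for x, l in train if l == lbl])
            for lbl in {l for _, l in train}
        }
        pred = min(
            cents,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(X[i], cents[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def score_groups(expr, y, groups):
    """Scoring: evaluate each gene Group by how well its genes separate the classes."""
    scores = {}
    for name, genes in groups.items():
        present = [g for g in genes if g in expr]
        if not present:
            continue  # no measured genes for this Group, so it cannot be scored
        X = list(zip(*(expr[g] for g in present)))  # one row per sample
        scores[name] = loo_accuracy(X, y)
    return scores


def gsm(expr, y, groups, top_k=1):
    """Rank Groups by score and pool the genes of the top_k Groups into a signature."""
    scores = score_groups(expr, y, groups)
    ranked = sorted(scores, key=scores.get, reverse=True)
    signature = sorted({g for n in ranked[:top_k] for g in groups[n] if g in expr})
    return ranked, scores, signature


# Hypothetical toy data: six samples (three controls, three cases), four genes.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [0.5, 0.4, 0.6, 2.0, 2.2, 1.9],
    "g3": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
    "g4": [0.3, 0.1, 0.2, 0.1, 0.3, 0.2],
}
# Grouping: hypothetical prior knowledge mapping each disease to its genes.
gene_groups = {
    "disease_A": ["g1", "g2"],
    "disease_B": ["g3", "g4"],
    "disease_C": ["g5"],  # g5 is not measured in the toy data
}

ranked, scores, signature = gsm(expression, labels, gene_groups, top_k=1)
print(ranked, signature)
```

In this sketch the top-ranked Groups name the diseases whose genes best separate the two classes, which loosely mirrors how Group ranking can suggest disease-disease associations; a final Modelling step would train a classifier on the pooled `signature` genes.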