An Entropy-Based Directed Random Walk for Cancer Classification Using Gene Expression Data Based on Bi-Random Walk on Two Separated Networks
The integration of microarray technologies and machine learning methods has become popular in predicting the pathological condition of diseases and discovering risk genes. Traditional microarray analysis considers pathways as a simple gene set, treating all genes in the pathway identically while ign...
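Main Authors: | Xin Hui Tay, Shahreen Kasim, Tole Sutikno, Mohd Farhan Md Fudzee, Rohayanti Hassan, Emelia Akashah Patah Akhir, Norshakirah Aziz, Choon Sen Seah |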
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-02-01 |
Series: | Genes |
Subjects: | directed random walk; pathway-based analysis; cancer classification |
Online Access: | https://www.mdpi.com/2073-4425/14/3/574 |
_version_ | 1797611549143072768 |
---|---|
author | Xin Hui Tay; Shahreen Kasim; Tole Sutikno; Mohd Farhan Md Fudzee; Rohayanti Hassan; Emelia Akashah Patah Akhir; Norshakirah Aziz; Choon Sen Seah |
author_sort | Xin Hui Tay |
collection | DOAJ |
description | The integration of microarray technologies and machine learning methods has become popular for predicting the pathological condition of diseases and discovering risk genes. Traditional microarray analysis treats a pathway as a simple gene set, weighting all genes in the pathway identically and ignoring the structural information in the pathway network. This study proposes an entropy-based directed random walk (e-DRW) method to infer pathway activities. It makes two enhancements to the conventional DRW: (1) it increases the coverage of human pathway information by constructing two input networks for pathway activity inference, and (2) it enhances the gene-weighting method in DRW by incorporating correlation coefficient values and t-test statistic scores. To test these objectives, gene expression datasets were used as input datasets, while pathway datasets were used as reference datasets to build two directed graphs. The within-dataset experiments indicated that the e-DRW method outperformed the other methods in both classification accuracy and the robustness of the predicted risk-active pathways. In conclusion, the results revealed that e-DRW not only improved prediction performance but also effectively extracted topologically important pathways and genes that were specifically related to the corresponding cancer types. |
first_indexed | 2024-03-11T06:30:14Z |
format | Article |
id | doaj.art-9e7be410bdea4d53b684117302ce97d4 |
institution | Directory Open Access Journal |
issn | 2073-4425 |
language | English |
last_indexed | 2024-03-11T06:30:14Z |
publishDate | 2023-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Genes |
spelling | doaj.art-9e7be410bdea4d53b684117302ce97d4; 2023-11-17T11:16:27Z; eng; MDPI AG; Genes; ISSN 2073-4425; 2023-02-01; vol. 14, no. 3, art. 574; DOI 10.3390/genes14030574. Author affiliations: Xin Hui Tay (Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat 83000, Malaysia); Shahreen Kasim (Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat 83000, Malaysia); Tole Sutikno (Department of Electrical Engineering, Universitas Ahmad Dahlan, Yogyakarta 55166, Indonesia); Mohd Farhan Md Fudzee (Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat 83000, Malaysia); Rohayanti Hassan (School of Computing, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia); Emelia Akashah Patah Akhir (Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar 32610, Malaysia); Norshakirah Aziz (Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar 32610, Malaysia); Choon Sen Seah (Faculty of Accounting & Management, Universiti Tunku Abdul Rahman, Kajang 43000, Malaysia) |
title | An Entropy-Based Directed Random Walk for Cancer Classification Using Gene Expression Data Based on Bi-Random Walk on Two Separated Networks |
topic | directed random walk pathway-based analysis cancer classification |
url | https://www.mdpi.com/2073-4425/14/3/574 |
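
The description above mentions a directed random walk over a pathway graph whose gene weights incorporate correlation coefficients and t-test scores. As a rough illustration only, the sketch below runs a generic random walk with restart on a small directed graph. It is not the authors' e-DRW implementation: the record does not say how the two scores are combined, so the product of absolute values used here, the restart probability of 0.7, the function name, and the toy genes A, B, and C are all assumptions made for the example.

```python
def random_walk_with_restart(edges, initial, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate W_{t+1} = (1 - r) * M^T W_t + r * W_0 on a directed graph.

    edges   : list of (source, target) gene pairs
    initial : dict mapping gene -> non-negative initial score
    restart : restart probability r (hypothetical default)
    """
    nodes = sorted(set(initial) | {n for edge in edges for n in edge})
    out_degree = {n: 0 for n in nodes}
    for src, _dst in edges:
        out_degree[src] += 1

    # Normalise the initial scores so that W_0 sums to 1.
    total = sum(initial.get(n, 0.0) for n in nodes)
    w0 = {n: initial.get(n, 0.0) / total for n in nodes}

    w = dict(w0)
    for _ in range(max_iter):
        # Restart term: every node regains a share of its initial weight.
        nxt = {n: restart * w0[n] for n in nodes}
        # Walk term: each node splits its weight evenly over outgoing edges.
        # Nodes without outgoing edges pass nothing on, so the total weight
        # may end up below 1.
        for src, dst in edges:
            nxt[dst] += (1.0 - restart) * w[src] / out_degree[src]
        delta = sum(abs(nxt[n] - w[n]) for n in nodes)
        w = nxt
        if delta < tol:
            break
    return w


# Toy example with made-up statistics for three hypothetical genes.
t_stats = {"A": 2.0, "B": 1.0, "C": 0.5}
correlations = {"A": 0.5, "B": 0.8, "C": 0.4}
# Assumed combination rule: product of absolute values.
initial = {g: abs(t_stats[g]) * abs(correlations[g]) for g in t_stats}
edges = [("A", "B"), ("A", "C"), ("B", "C")]

weights = random_walk_with_restart(edges, initial)
for gene in sorted(weights):
    print(gene, round(weights[gene], 5))
```

In this toy graph, gene C has two incoming edges and ends with more weight than its restart share alone, which is the kind of topological effect a plain gene-set treatment would miss.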