An Entropy-Based Directed Random Walk for Cancer Classification Using Gene Expression Data Based on Bi-Random Walk on Two Separated Networks

The integration of microarray technologies and machine learning methods has become popular in predicting the pathological condition of diseases and discovering risk genes. Traditional microarray analysis considers pathways as a simple gene set, treating all genes in the pathway identically while ign...

Bibliographic Details
Main Authors: Xin Hui Tay, Shahreen Kasim, Tole Sutikno, Mohd Farhan Md Fudzee, Rohayanti Hassan, Emelia Akashah Patah Akhir, Norshakirah Aziz, Choon Sen Seah
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Genes
Subjects: directed random walk; pathway-based analysis; cancer classification
Online Access: https://www.mdpi.com/2073-4425/14/3/574
author Xin Hui Tay
Shahreen Kasim
Tole Sutikno
Mohd Farhan Md Fudzee
Rohayanti Hassan
Emelia Akashah Patah Akhir
Norshakirah Aziz
Choon Sen Seah
author_sort Xin Hui Tay
collection DOAJ
description The integration of microarray technologies and machine learning methods has become popular for predicting the pathological condition of diseases and discovering risk genes. Traditional microarray analysis treats a pathway as a simple gene set, treating all genes in the pathway identically and ignoring the structural information of the pathway network. This study proposes an entropy-based directed random walk (e-DRW) method to infer pathway activities. It makes two enhancements to the conventional DRW: (1) it increases the coverage of human pathway information by constructing two input networks for pathway activity inference, and (2) it improves the gene-weighting method in DRW by incorporating correlation coefficient values and t-test statistic scores. To evaluate these enhancements, gene expression datasets were used as input data, while pathway datasets were used as reference data to build two directed graphs. Within-dataset experiments indicated that the e-DRW method outperformed the other methods in both classification accuracy and the robustness of the predicted risk-active pathways. In conclusion, the results show that e-DRW not only improved prediction performance but also effectively extracted topologically important pathways and genes that were specifically related to the corresponding cancer types.
first_indexed 2024-03-11T06:30:14Z
format Article
id doaj.art-9e7be410bdea4d53b684117302ce97d4
institution Directory Open Access Journal
issn 2073-4425
language English
last_indexed 2024-03-11T06:30:14Z
publishDate 2023-02-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-9e7be410bdea4d53b684117302ce97d4
2023-11-17T11:16:27Z
eng
MDPI AG
Genes
2073-4425
2023-02-01
14
3
574
10.3390/genes14030574
An Entropy-Based Directed Random Walk for Cancer Classification Using Gene Expression Data Based on Bi-Random Walk on Two Separated Networks
Xin Hui Tay 0
Shahreen Kasim 1
Tole Sutikno 2
Mohd Farhan Md Fudzee 3
Rohayanti Hassan 4
Emelia Akashah Patah Akhir 5
Norshakirah Aziz 6
Choon Sen Seah 7
Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat 83000, Malaysia
Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat 83000, Malaysia
Department of Electrical Engineering, Universitas Ahmad Dahlan, Yogyakarta 55166, Indonesia
Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat 83000, Malaysia
School of Computing, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar 32610, Malaysia
Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar 32610, Malaysia
Faculty of Accounting & Management, Universiti Tunku Abdul Rahman, Kajang 43000, Malaysia
The integration of microarray technologies and machine learning methods has become popular for predicting the pathological condition of diseases and discovering risk genes. Traditional microarray analysis treats a pathway as a simple gene set, treating all genes in the pathway identically and ignoring the structural information of the pathway network. This study proposes an entropy-based directed random walk (e-DRW) method to infer pathway activities. It makes two enhancements to the conventional DRW: (1) it increases the coverage of human pathway information by constructing two input networks for pathway activity inference, and (2) it improves the gene-weighting method in DRW by incorporating correlation coefficient values and t-test statistic scores. To evaluate these enhancements, gene expression datasets were used as input data, while pathway datasets were used as reference data to build two directed graphs. Within-dataset experiments indicated that the e-DRW method outperformed the other methods in both classification accuracy and the robustness of the predicted risk-active pathways. In conclusion, the results show that e-DRW not only improved prediction performance but also effectively extracted topologically important pathways and genes that were specifically related to the corresponding cancer types.
https://www.mdpi.com/2073-4425/14/3/574
directed random walk
pathway-based analysis
cancer classification
title An Entropy-Based Directed Random Walk for Cancer Classification Using Gene Expression Data Based on Bi-Random Walk on Two Separated Networks
topic directed random walk
pathway-based analysis
cancer classification
url https://www.mdpi.com/2073-4425/14/3/574