Enhanced Directed Random Walk for the Identification of Breast Cancer Prognostic Markers from Multiclass Expression Data

Artificial intelligence in healthcare can potentially identify the probability of contracting a particular disease more accurately. There are five common molecular subtypes of breast cancer: luminal A, luminal B, basal, ERBB2, and normal-like. Previous investigations showed that pathway-based microa...

Bibliographic Details
Main Authors: Hui Wen Nies, Mohd Saberi Mohamad, Zalmiyah Zakaria, Weng Howe Chan, Muhammad Akmal Remli, Yong Hui Nies
Format: Article
Language: English
Published: MDPI AG 2021-09-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/23/9/1232
collection DOAJ
description Artificial intelligence in healthcare can potentially identify the probability of contracting a particular disease more accurately. There are five common molecular subtypes of breast cancer: luminal A, luminal B, basal, ERBB2, and normal-like. Previous investigations showed that pathway-based microarray analysis could help in the identification of prognostic markers from gene expressions. For example, directed random walk (DRW) can infer a greater reproducibility power of the pathway activity between two classes of samples with a higher classification accuracy. However, most of the existing methods (including DRW) ignored the characteristics of different cancer subtypes and considered all of the pathways to contribute equally to the analysis. Therefore, an enhanced DRW (eDRW+) is proposed to identify breast cancer prognostic markers from multiclass expression data. An improved weight strategy using one-way ANOVA (F-test) and pathway selection based on the greatest reproducibility power is proposed in eDRW+. The experimental results show that the eDRW+ exceeds other methods in terms of AUC. Besides this, the eDRW+ identifies 294 gene markers and 45 pathway markers from the breast cancer datasets with better AUC. Therefore, the prognostic markers (pathway markers and gene markers) can identify drug targets and look for cancer subtypes with clinically distinct outcomes.
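The description above mentions a weight strategy based on one-way ANOVA (F-test) for multiclass expression data. The following is a minimal sketch of that statistic only, not the authors' eDRW+ implementation. The function names `one_way_anova_f` and `f_test_gene_weights`, the gene labels, the toy expression values, and the min-max scaling of F-statistics into weights are all assumptions made for illustration; the record does not say how eDRW+ turns F-statistics into weights.

```python
from statistics import mean


def one_way_anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups (one group per class)."""
    k = len(groups)                      # number of classes (e.g. subtypes)
    n = sum(len(g) for g in groups)      # total number of samples
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own class mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("zero within-group variance")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def f_test_gene_weights(expr_by_gene):
    """Hypothetical weighting: min-max scale per-gene F-statistics to [0, 1]."""
    scores = {gene: one_way_anova_f(groups) for gene, groups in expr_by_gene.items()}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {gene: (s - lo) / span if span else 1.0 for gene, s in scores.items()}


# Toy data: each gene maps to expression values grouped by class (assumed values).
toy_expression = {
    "GENE_A": [[1, 2, 3], [2, 3, 4], [5, 6, 7]],   # class means differ
    "GENE_B": [[1, 2, 3], [1, 2, 3], [1, 2, 3]],   # class means identical
}
weights = f_test_gene_weights(toy_expression)
```

In this toy input, `GENE_A` separates the classes and receives the larger weight, while `GENE_B` has identical class means and receives the smallest. A library routine such as `scipy.stats.f_oneway` computes the same statistic and also returns a p-value.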
first_indexed 2024-03-10T07:41:28Z
id doaj.art-4c839ade2ff94571865370d562ebd557
institution Directory Open Access Journal
issn 1099-4300
DOI: 10.3390/e23091232
Volume: 23
Issue: 9
Article number: 1232
Record timestamp: 2023-11-22T12:58:42Z
Author affiliations:
Hui Wen Nies: School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
Mohd Saberi Mohamad: Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain 17666, United Arab Emirates
Zalmiyah Zakaria: School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
Weng Howe Chan: School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
Muhammad Akmal Remli: Institute for Artificial Intelligence and Big Data, Universiti Malaysia Kelantan, Kota Bharu 16100, Malaysia
Yong Hui Nies: Department of Anatomy, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Cheras, Kuala Lumpur 56000, Malaysia
topic prognostic markers
breast cancer
multiclass
microarray analysis
ANOVA
pathway selection