Enhanced Directed Random Walk for the Identification of Breast Cancer Prognostic Markers from Multiclass Expression Data
Main Authors: | Hui Wen Nies, Mohd Saberi Mohamad, Zalmiyah Zakaria, Weng Howe Chan, Muhammad Akmal Remli, Yong Hui Nies |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-09-01 |
Series: | Entropy |
Subjects: | prognostic markers; breast cancer; multiclass; microarray analysis; ANOVA; pathway selection |
Online Access: | https://www.mdpi.com/1099-4300/23/9/1232 |
_version_ | 1797519340425183232 |
---|---|
author | Hui Wen Nies; Mohd Saberi Mohamad; Zalmiyah Zakaria; Weng Howe Chan; Muhammad Akmal Remli; Yong Hui Nies |
author_sort | Hui Wen Nies |
collection | DOAJ |
description | Artificial intelligence in healthcare has the potential to estimate more accurately the probability of contracting a particular disease. Breast cancer has five common molecular subtypes: luminal A, luminal B, basal, ERBB2, and normal-like. Previous investigations showed that pathway-based microarray analysis can help identify prognostic markers from gene expression data. For example, directed random walk (DRW) infers pathway activity between two classes of samples with greater reproducibility power and higher classification accuracy. However, most existing methods (including DRW) ignore the characteristics of the different cancer subtypes and treat all pathways as contributing equally to the analysis. Therefore, an enhanced DRW (eDRW+) is proposed to identify breast cancer prognostic markers from multiclass expression data. eDRW+ introduces an improved weight strategy based on one-way ANOVA (F-test) and selects pathways by greatest reproducibility power. The experimental results show that eDRW+ outperforms the other methods in terms of AUC. In addition, eDRW+ identifies 294 gene markers and 45 pathway markers from the breast cancer datasets with better AUC. These prognostic markers (pathway markers and gene markers) can therefore help identify drug targets and cancer subtypes with clinically distinct outcomes. |
first_indexed | 2024-03-10T07:41:28Z |
format | Article |
id | doaj.art-4c839ade2ff94571865370d562ebd557 |
institution | Directory of Open Access Journals |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-10T07:41:28Z |
publishDate | 2021-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-4c839ade2ff94571865370d562ebd557; 2023-11-22T12:58:42Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2021-09-01; vol. 23, no. 9, art. 1232; DOI 10.3390/e23091232; Enhanced Directed Random Walk for the Identification of Breast Cancer Prognostic Markers from Multiclass Expression Data. Authors and affiliations: Hui Wen Nies (School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia); Mohd Saberi Mohamad (Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain 17666, United Arab Emirates); Zalmiyah Zakaria (School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia); Weng Howe Chan (School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia); Muhammad Akmal Remli (Institute for Artificial Intelligence and Big Data, Universiti Malaysia Kelantan, Kota Bharu 16100, Malaysia); Yong Hui Nies (Department of Anatomy, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Cheras, Kuala Lumpur 56000, Malaysia). https://www.mdpi.com/1099-4300/23/9/1232 |
title | Enhanced Directed Random Walk for the Identification of Breast Cancer Prognostic Markers from Multiclass Expression Data |
topic | prognostic markers; breast cancer; multiclass; microarray analysis; ANOVA; pathway selection |
url | https://www.mdpi.com/1099-4300/23/9/1232 |
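
The description mentions a weight strategy based on one-way ANOVA (F-test) over multiclass expression data. As a rough illustration only, the sketch below computes the standard one-way ANOVA F statistic for each gene across subtype groups. The gene names and expression values are hypothetical, and the record does not say how eDRW+ turns F statistics into weights, so this should not be read as the authors' implementation.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)                      # number of classes (e.g. subtypes)
    n = sum(len(g) for g in groups)      # total number of samples
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = fmean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("zero within-group variance")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values: gene -> one list of samples per subtype.
expression = {
    "GENE_A": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "GENE_B": [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
}

# A larger F suggests the gene separates the subtypes more strongly.
gene_f = {gene: one_way_anova_f(groups) for gene, groups in expression.items()}
```

In this toy data, GENE_A differs across the three groups while GENE_B does not, so GENE_A receives the larger F statistic.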