Handling Big Microarray Data: A Novel Approach to Design Accurate Fuzzy-Based Medical Expert System

The gene data produced by microarray experiments is complex in terms of dimensions and samples. Processing it for disease analysis with an expert system consumes considerable computing power and time. At the same time, the data can help doctors identify a patient’s he...

Bibliographic Details
Main Authors:	Ganeshkumar Pugalendhi, M. Mazhar Rathore, Dhirendra Shukla, Anand Paul
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	f-information; fuzzy expert system; microarray data; particle swarm optimization
Online Access:	https://ieeexplore.ieee.org/document/10072401/

_version_	1797847342538293248
author	Ganeshkumar Pugalendhi; M. Mazhar Rathore; Dhirendra Shukla; Anand Paul
author_facet	Ganeshkumar Pugalendhi; M. Mazhar Rathore; Dhirendra Shukla; Anand Paul
author_sort	Ganeshkumar Pugalendhi
collection	DOAJ
description	The gene data produced by microarray experiments is complex in terms of dimensions and samples. Processing it for disease analysis with an expert system consumes considerable computing power and time. At the same time, the data can help doctors identify a patient’s health condition if it is presented in a meaningful way and processed on time. Several methods have been proposed to reduce the dimensions of medical microarray data and optimize its search space with minimal accuracy loss. However, discretizing continuous gene values during dimension reduction fails to preserve the inherent meaning of genes. Also, ensuring high accuracy and interpretability in the reduction process may result in extra processing time, which is unfavorable for time-critical applications. To overcome these issues, we propose a dimension reduction method in conjunction with a fuzzy expert system (FES) optimization approach, while keeping the accuracy-interpretability-speed tradeoff in mind. To accomplish this, we use a fuzzy rough set on f-information to identify meaningful genes without changing their original values. We propose a conditionally guided particle swarm optimization for faster knowledge acquisition, where the velocity is adjusted based on a predefined update probability, resulting in a faster search. A big data processing architecture is designed using the Hadoop ecosystem, along with a MapReduce-equivalent algorithm of the proposed method, enabling parallel processing of microarray data to reduce dimensions and perform classification through knowledge extraction. The proposed method is thoroughly tested on eleven microarray datasets with the accuracy-interpretability-speed tradeoff in mind. The results show that the proposed method is effective in identifying disease-causing genes while also understanding the patient’s genetic profile with only a few operations and a small amount of CPU time. Statistical tests are also run to validate the proposed method’s efficacy in comparison to other methods.
first_indexed	2024-04-09T18:09:47Z
format	Article
id	doaj.art-2065c73aabb64e94ae338f272c2ad18d
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-04-09T18:09:47Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-2065c73aabb64e94ae338f272c2ad18d | 2023-04-13T23:01:04Z | eng | IEEE | IEEE Access | 2169-3536 | 2023-01-01 | 11 | 35182 | 35196 | 10.1109/ACCESS.2023.3257875 | 10072401 | Handling Big Microarray Data: A Novel Approach to Design Accurate Fuzzy-Based Medical Expert System | Ganeshkumar Pugalendhi (https://orcid.org/0000-0001-8681-8169) | M. Mazhar Rathore | Dhirendra Shukla (https://orcid.org/0000-0002-0036-714X) | Anand Paul (https://orcid.org/0009-0001-2119-5148) | Department of Information Technology, Anna University Regional Campus, Coimbatore, India | Dr. J. Herbert Smith Centre, University of New Brunswick, Fredericton, Canada | Dr. J. Herbert Smith Centre, University of New Brunswick, Fredericton, Canada | School of Computer Science and Engineering, Kyungpook National University, Daegu, South Korea | https://ieeexplore.ieee.org/document/10072401/ | f-information | fuzzy expert system | microarray data | particle swarm optimization
title	Handling Big Microarray Data: A Novel Approach to Design Accurate Fuzzy-Based Medical Expert System
topic	f-information; fuzzy expert system; microarray data; particle swarm optimization
url	https://ieeexplore.ieee.org/document/10072401/

Similar Items