Gene selection for high dimensional data using k-means clustering algorithm and statistical approach

Microarray technology can measure thousands of genes, which is useful for biologists studying and classifying cancer cells. However, this high-dimensional data contains a large number of genes to be examined relative to a small sample size. Thus, selection of relevant genes is a challenging issue in...

Bibliographic Details
Main Authors:	Ahmad, Farzana Kabir; Yusof, Yuhanis; Othman, Nor Hayati
Format:	Conference or Workshop Item
Language:	English
Published:	2014
Subjects:	QA75 Electronic computers. Computer science
Online Access:	https://repo.uum.edu.my/id/eprint/16491/1/IEEE1.pdf
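
The abstract in this record names four statistics used to judge how well genes discriminate between classes. A minimal sketch of that scoring step follows, assuming a two-class problem, sample (n-1) variances, and the textbook forms of each statistic. The function names, the `rank_genes` helper, and the toy data are hypothetical and are not taken from the paper. The Welch form of the t-statistic is also an assumption, and the k-means grouping and k-NN training steps are omitted.

```python
from math import sqrt
from statistics import mean, stdev, variance


def fisher_score(a, b):
    """Fisher criterion: squared mean difference over summed variances."""
    return (mean(a) - mean(b)) ** 2 / (variance(a) + variance(b))


def signal_to_noise(a, b):
    """Golub signal-to-noise: mean difference over summed standard deviations."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def welch_t(a, b):
    """t-statistic with unequal variances (Welch form, an assumption here)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def mann_whitney_u(a, b):
    """U statistic for sample a: each pair with a > b counts 1, each tie 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def rank_genes(expr, labels, score=fisher_score):
    """Rank genes by absolute score, highest first.

    expr maps a gene name to its expression values (one per sample);
    labels holds the class (0 or 1) of each sample, in the same order.
    """
    ranked = []
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        ranked.append((abs(score(a, b)), gene))
    return [gene for _, gene in sorted(ranked, reverse=True)]


# Toy data (hypothetical): gene_a separates the classes, gene_b does not.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "gene_a": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
    "gene_b": [2.0, 3.0, 4.0, 2.5, 3.5, 3.0],
}
top = rank_genes(expr, labels)
```

On this toy input `gene_a` ranks first because its class means differ widely relative to its within-class spread, while `gene_b` has identical class means. Passing `score=signal_to_noise` or `score=welch_t` swaps in a different statistic without changing the ranking logic.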

author	Ahmad, Farzana Kabir; Yusof, Yuhanis; Othman, Nor Hayati
collection	UUM
description	Microarray technology can measure thousands of genes, which is useful for biologists studying and classifying cancer cells. However, this high-dimensional data contains a large number of genes to be examined relative to a small sample size. Thus, selection of relevant genes is a challenging issue in microarray data analysis and has been a central research focus. This study proposes the k-means clustering algorithm to group the relevant genes. Several statistical techniques, such as the Fisher criterion, Golub signal-to-noise, Mann-Whitney rank and t-test, are used to decide whether the clusters are well separated from one another. Genes with high discriminative scores are then used to train the k-NN classifier. The experimental results show that the proposed gene selection methods are able to identify differentially expressed genes with a ROC score of 0.86.
first_indexed	2024-07-04T06:02:13Z
format	Conference or Workshop Item
id	uum-16491
institution	Universiti Utara Malaysia
language	English
last_indexed	2024-07-04T06:02:13Z
publishDate	2014
record_format	dspace
spelling	uum-16491 2016-04-27T07:19:08Z https://repo.uum.edu.my/id/eprint/16491/ 2014-08-27 Conference or Workshop Item PeerReviewed application/pdf en https://repo.uum.edu.my/id/eprint/16491/1/IEEE1.pdf Ahmad, Farzana Kabir and Yusof, Yuhanis and Othman, Nor Hayati (2014) Gene selection for high dimensional data using k-means clustering algorithm and statistical approach. In: International Conference on Computational Science and Technology (ICCST), 27-28 Aug. 2014, Kota Kinabalu. http://doi.org/10.1109/ICCST.2014.7045188 doi:10.1109/ICCST.2014.7045188
title	Gene selection for high dimensional data using k-means clustering algorithm and statistical approach
topic	QA75 Electronic computers. Computer science
url	https://repo.uum.edu.my/id/eprint/16491/1/IEEE1.pdf

Similar Items