Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene's contribution to the joint discriminability when this gene is included into a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis testin...

Bibliographic Details
Main Authors: Simin Hu, J. Sunil Rao
Format: Article
Language: English
Published: SAGE Publishing 2007-01-01
Series: Cancer Informatics
Online Access: https://doi.org/10.1177/117693510700300010
Abstract: In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene's contribution to the joint discriminability when this gene is included into a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis testing for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show the proposed gene selection methods can select a compact gene subset which can not only be used to build high quality cancer classifiers but also show biological relevance.
Author Affiliations: Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, 44106 (both authors)
ISSN: 1176-9351
Volume: 3
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-75e3a94fb6a7470bbf4aaf7826c5ebb7