Bi-dimensional principal gene feature selection from big gene expression data

Gene expression sample data, which usually contain massive expression profiles of genes, are commonly used for disease-related gene analysis. Selecting relevant genes from a huge number of genes is a fundamental step in applications of gene expression data. As more and more genes have...


Bibliographic Details
Main Authors: Xiaoqian Hou, Jingyu Hou, Guangyan Huang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728919/?tool=EBI
author Xiaoqian Hou
Jingyu Hou
Guangyan Huang
collection DOAJ
description Gene expression sample data, which usually contain massive expression profiles of genes, are commonly used for disease-related gene analysis. Selecting relevant genes from a huge number of genes is a fundamental step in applications of gene expression data. As more and more genes are detected, gene expression datasets grow ever larger, which challenges the computing efficiency of extracting the relevant and important genes. In this paper, we propose a novel Bi-dimensional Principal Feature Selection (BPFS) method for efficiently extracting critical genes from big gene expression data. It applies principal component analysis (PCA) to the sample and gene domains successively, aiming to extract the relevant gene features and reduce redundancy while losing less information. Experimental results on four real-world cancer gene expression datasets show that the proposed BPFS method greatly reduces the data size and achieves nearly double the processing speed of the counterpart methods, while maintaining better accuracy and effectiveness.
first_indexed 2024-04-11T13:13:40Z
format Article
id doaj.art-80813ab0938c4ff6aa5dab3acd4779d6
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-11T13:13:40Z
publishDate 2022-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728919/?tool=EBI
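
The description in this record says only that BPFS applies PCA to the sample and gene domains successively; it gives no algorithmic detail. The sketch below is therefore one possible reading of that sentence, not the authors' BPFS implementation. The function names (pca_axes, two_pass_pca_select), the 0.9 variance threshold, and the loading-based gene score are all assumptions introduced for illustration. The first pass compresses the sample axis into a few principal "eigen-samples"; the second pass runs PCA with genes as variables on that compressed matrix and ranks genes by their variance-weighted squared loadings.

```python
import numpy as np


def pca_axes(data, var_ratio=0.9):
    """PCA via SVD; rows are observations, columns are variables.

    Returns (axes, variances): axes has shape (n_variables, k), where k is
    the smallest number of components whose cumulative share of the total
    variance reaches var_ratio (an assumed, illustrative threshold).
    """
    centered = data - data.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    var = s ** 2
    total = var.sum()
    if total == 0:
        k = 1  # degenerate input: keep a single (arbitrary) axis
    else:
        k = int(np.searchsorted(np.cumsum(var) / total, var_ratio)) + 1
    k = min(k, len(s))
    return vt[:k].T, var[:k]


def two_pass_pca_select(X, n_genes, var_ratio=0.9):
    """Hypothetical two-pass PCA gene ranking (not the published BPFS).

    X has shape (n_samples, n_total_genes). Returns the sorted indices of
    the n_genes highest-scoring genes and the compressed matrix.
    """
    # Pass 1 (sample domain): genes are observations, samples are variables.
    sample_axes, _ = pca_axes(X.T, var_ratio)        # (n_samples, k1)
    # Project every gene profile onto the k1 principal sample axes.
    reduced = sample_axes.T @ X                      # (k1, n_total_genes)

    # Pass 2 (gene domain): compressed samples are observations,
    # genes are variables. If k1 == 1, all scores are zero after centering.
    gene_axes, gene_var = pca_axes(reduced, var_ratio)  # (n_total_genes, k2)

    # Score each gene by its variance-weighted squared loadings.
    scores = (gene_axes ** 2) @ gene_var
    top = np.argsort(scores)[::-1][:n_genes]
    return np.sort(top), reduced
```

Called as two_pass_pca_select(X, n_genes=50) on a samples-by-genes matrix X, this returns the indices of the selected genes together with the compressed matrix. Ranking by loadings is only one way to turn principal components into a gene subset, and the choice of threshold and score would need to be checked against the paper itself.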