Bi-dimensional principal gene feature selection from big gene expression data
Gene expression sample data, which usually contains massive expression profiles of genes, is commonly used for disease-related gene analysis. Selecting relevant genes from a huge number of genes is a fundamental step in applications of gene expression data. As more and more genes have...
Main Authors: | Xiaoqian Hou, Jingyu Hou, Guangyan Huang |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2022-01-01 |
Series: | PLoS ONE |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728919/?tool=EBI |
_version_ | 1811184456291057664 |
---|---|
author | Xiaoqian Hou, Jingyu Hou, Guangyan Huang |
collection | DOAJ |
description | Gene expression sample data, which usually contains massive expression profiles of genes, is commonly used for disease-related gene analysis. Selecting relevant genes from a huge number of genes is a fundamental step in applications of gene expression data. As more and more genes are detected, gene expression datasets keep growing, which challenges the computational efficiency of extracting the relevant and important genes. In this paper, we propose a novel Bi-dimensional Principal Feature Selection (BPFS) method for efficiently extracting critical genes from big gene expression data. It applies principal component analysis (PCA) to the sample and gene domains successively, aiming to extract the relevant gene features and reduce redundancy while losing less information. Experimental results on four real-world cancer gene expression datasets show that the proposed BPFS method greatly reduces the data size and achieves nearly double the processing speed of the counterpart methods, while maintaining better accuracy and effectiveness. |
first_indexed | 2024-04-11T13:13:40Z |
format | Article |
id | doaj.art-80813ab0938c4ff6aa5dab3acd4779d6 |
institution | Directory Open Access Journal |
issn | 1932-6203 |
language | English |
last_indexed | 2024-04-11T13:13:40Z |
publishDate | 2022-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
title | Bi-dimensional principal gene feature selection from big gene expression data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728919/?tool=EBI |
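
The description states that BPFS applies PCA to the sample and gene domains successively. The record does not give the algorithm itself, so the following is only a minimal sketch of one plausible reading, not the authors' BPFS implementation. The function names (`top_components`, `select_genes_two_stage`), the parameters (`k_samples`, `k_genes`, `n_keep`), and the scoring rule are all assumptions made for illustration.

```python
import numpy as np


def top_components(matrix, k):
    """Column-center `matrix` and return (centered, top-k singular values,
    top-k right singular vectors) from its SVD."""
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered, s[:k], vt[:k]


def select_genes_two_stage(expr, k_samples, k_genes, n_keep):
    """Hypothetical two-stage PCA gene ranking (NOT the published BPFS).

    expr: array of shape (n_samples, n_genes).
    Returns the sorted indices of the n_keep highest-scoring genes.
    """
    # Stage 1 (sample domain): rows are genes, columns are samples.
    # Project every gene onto the top principal directions in sample space,
    # shrinking the sample dimension from n_samples to k_samples.
    centered, _, vt = top_components(expr.T, k_samples)
    compressed = centered @ vt.T  # shape (n_genes, k_samples)

    # Stage 2 (gene domain): rows are compressed sample components,
    # columns are genes. Run PCA again, now with genes as the variables.
    _, s, vt_g = top_components(compressed.T, k_genes)

    # Score each gene by squared singular value times squared loading,
    # summed over the retained components (unnormalized captured variance).
    scores = (s[:, None] ** 2 * vt_g ** 2).sum(axis=0)

    keep = np.argsort(scores)[::-1][:n_keep]
    return np.sort(keep)
```

In this sketch the first stage reduces the number of columns each gene carries, which is where a reduction in data size would come from; the second stage ranks genes on that smaller matrix. Whether the published method scores genes this way is not stated in the record.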