Non-Negative Symmetric Low-Rank Representation Graph Regularized Method for Cancer Clustering Based on Score Function

As an important approach to cancer classification, cancer sample clustering is of particular importance for cancer research. For high-dimensional gene expression data, examining approaches to selecting characteristic genes with high identification for cancer sample clustering is an important researc...

Bibliographic Details
Main Authors: Conghai Lu, Juan Wang, Jinxing Liu, Chunhou Zheng, Xiangzhen Kong, Xiaofeng Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-01-01
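Series: Frontiers in Genetics
Subjects: cancer gene expression data; low-rank representation; feature selection; score function; clustering
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2019.01353/full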
author Conghai Lu
Juan Wang
Jinxing Liu
Chunhou Zheng
Xiangzhen Kong
Xiaofeng Zhang
collection DOAJ
description As an important approach to cancer classification, cancer sample clustering is of particular importance for cancer research. For high-dimensional gene expression data, selecting characteristic genes with high discriminative power for cancer sample clustering is an important research area in bioinformatics. In this paper, we propose a novel integrated framework for cancer clustering, the non-negative symmetric low-rank representation with graph regularization based on score function (NSLRG-S). First, a lowest-rank matrix is obtained by NSLRG decomposition; it preserves both the local manifold information and the global structure information of the gene expression data. Second, we construct the score function from the lowest-rank matrix to weight every feature of the gene expression data and calculate its score. Third, we rank the features by score and select the feature genes for cancer sample clustering. Finally, we cluster the cancer samples with the K-means method using the selected feature genes. The experiments are conducted on The Cancer Genome Atlas (TCGA) data. Comparative experiments demonstrate that the NSLRG-S framework can significantly improve the clustering performance.
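The last three steps of the description (score each feature, rank and select, then cluster with K-means) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not define the score function, so a Laplacian-style score is swapped in as a stand-in, and the affinity matrix `Z` is a hand-made symmetric non-negative matrix standing in for the lowest-rank matrix that NSLRG would produce. The names `laplacian_style_score`, `select_top_features`, and `kmeans`, as well as the toy data, are hypothetical.

```python
def laplacian_style_score(X, Z):
    """Stand-in score (NOT the paper's score function).

    X: samples x features (list of lists).
    Z: symmetric non-negative sample affinity matrix (assumed given).
    A lower score means the feature changes little between linked samples.
    """
    n, m = len(X), len(X[0])
    degree = [sum(row) for row in Z]
    total = sum(degree)
    scores = []
    for j in range(m):
        f = [X[i][j] for i in range(n)]
        mean = sum(f[i] * degree[i] for i in range(n)) / total
        f = [v - mean for v in f]  # remove the degree-weighted mean
        num = 0.5 * sum(Z[i][k] * (f[i] - f[k]) ** 2
                        for i in range(n) for k in range(n))
        den = sum(degree[i] * f[i] ** 2 for i in range(n))
        scores.append(num / den if den > 0 else float("inf"))
    return scores


def select_top_features(scores, k):
    """Indices of the k features with the lowest (best) stand-in score."""
    return sorted(range(len(scores)), key=lambda j: scores[j])[:k]


def kmeans(points, k, iterations=100):
    """Plain Lloyd's K-means with a deterministic farthest-point start."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (hypothetical): 6 samples, 3 "genes"; only gene 0 separates the groups.
X = [[0.0, 5.0, 1.0], [0.2, 1.0, 9.0], [0.1, 9.0, 4.0],
     [10.0, 4.0, 2.0], [10.1, 8.0, 8.0], [9.9, 2.0, 5.0]]
# Block affinity: samples 0-2 are linked to each other, as are samples 3-5.
Z = [[1.0 if i != k and (i < 3) == (k < 3) else 0.0 for k in range(6)]
     for i in range(6)]

scores = laplacian_style_score(X, Z)
selected = select_top_features(scores, 1)
reduced = [[row[j] for j in selected] for row in X]
labels = kmeans(reduced, 2)
print(selected, labels)
```

In this sketch a lower score marks a better feature, so the ranking keeps the smallest values; a score function defined the other way round would need the sort order reversed.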
first_indexed 2024-12-13T13:19:28Z
format Article
id doaj.art-66bf45d69b154dcc81f7b1848f4e3b7b
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-13T13:19:28Z
publishDate 2020-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-66bf45d69b154dcc81f7b1848f4e3b7b
2022-12-21T23:44:27Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2020-01-01
10
10.3389/fgene.2019.01353
496650
Non-Negative Symmetric Low-Rank Representation Graph Regularized Method for Cancer Clustering Based on Score Function
Conghai Lu (0), School of Information Science and Engineering, Qufu Normal University, Rizhao, China
Juan Wang (1), School of Information Science and Engineering, Qufu Normal University, Rizhao, China
Jinxing Liu (2), School of Information Science and Engineering, Qufu Normal University, Rizhao, China
Chunhou Zheng (3), College of Electrical Engineering and Automation, Anhui University, Hefei, China
Xiangzhen Kong (4), School of Information Science and Engineering, Qufu Normal University, Rizhao, China
Xiaofeng Zhang (5), School of Information and Electrical Engineering, Ludong University, Yantai, China
As an important approach to cancer classification, cancer sample clustering is of particular importance for cancer research. For high dimensional gene expression data, examining approaches to selecting characteristic genes with high identification for cancer sample clustering is an important research area in the bioinformatics field. In this paper, we propose a novel integrated framework for cancer clustering known as the non-negative symmetric low-rank representation with graph regularization based on score function (NSLRG-S). First, a lowest rank matrix is obtained after NSLRG decomposition. The lowest rank matrix preserves the local data manifold information and the global data structure information of the gene expression data. Second, we construct the Score function based on the lowest rank matrix to weight all of the features of the gene expression data and calculate the score of each feature. Third, we rank the features according to their scores and select the feature genes for cancer sample clustering. Finally, based on selected feature genes, we use the K-means method to cluster the cancer samples. The experiments are conducted on The Cancer Genome Atlas (TCGA) data. Comparative experiments demonstrate that the NSLRG-S framework can significantly improve the clustering performance.
https://www.frontiersin.org/article/10.3389/fgene.2019.01353/full
cancer gene expression data
low-rank representation
feature selection
score function
clustering
title Non-Negative Symmetric Low-Rank Representation Graph Regularized Method for Cancer Clustering Based on Score Function
topic cancer gene expression data
low-rank representation
feature selection
score function
clustering
url https://www.frontiersin.org/article/10.3389/fgene.2019.01353/full