Double Weighted Ensemble Clustering for Cancer Subtypes Analysis

The era of big data makes precision medicine possible. The guiding idea for cancer is to divide and treat. In principle, each person’s cancer is different, so personalized treatment plans are necessary for different cancer patients. Subty...

Bibliographic Details
Main Authors: Xin Zhang, Hua Huo
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
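Subjects: Cancer subtypes analysis; ensemble clustering; double weighted ensemble clustering; similarity matrix; entropy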
Online Access: https://ieeexplore.ieee.org/document/9756537/
_version_ 1828403324126232576
author Xin Zhang
Hua Huo
author_facet Xin Zhang
Hua Huo
author_sort Xin Zhang
collection DOAJ
description The era of big data makes precision medicine possible. The guiding idea for cancer is to divide and treat. In principle, each person’s cancer is different, so personalized treatment plans are necessary for different cancer patients. Subtype analysis of cancer can be viewed as a clustering problem, and ensemble clustering techniques are widely used for their ability to combine multiple base clusterings into potentially better and more robust clusterings. However, the reliability of current ensemble clustering methods in cancer subtype analysis still needs to be improved. Therefore, we propose a double weighted ensemble clustering method (DWEC). It first derives the similarity matrix of each base clustering using a local weighting method; this step can be regarded as the first weighting, based on clusters. Subsequently, finding the final partition is treated as an optimization problem, in which the similarity matrix corresponding to each base clustering is weighted a second time and the optimal partition is solved by the block coordinate descent algorithm. The method obtained the best experimental results on both labeled datasets and unlabeled cancer gene datasets, validating its superiority. For cancer subtype analysis, our proposed method did not show statistically significant differences in the survival distributions of the several subtypes of glioblastoma multiforme. However, it performed best in the results of the temporal test on the other four cancer gene datasets, and we therefore conclude that our method is more effective for cancer subtype analysis than the other methods.
first_indexed 2024-12-10T10:17:15Z
format Article
id doaj.art-6c02f50c40f6460c83b6eee59a252d3c
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-10T10:17:15Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-6c02f50c40f6460c83b6eee59a252d3c; 2022-12-22T01:52:59Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 41477; 41488; 10.1109/ACCESS.2022.3167031; 9756537; Double Weighted Ensemble Clustering for Cancer Subtypes Analysis; Xin Zhang [0], https://orcid.org/0000-0002-7082-1512; Hua Huo [1], https://orcid.org/0000-0001-9545-5443; School of Information Engineering, Henan University of Science and Technology, Luoyang, China; Engineering Technology Research Center of Big Data and Computational Intelligence, Henan University of Science and Technology, Luoyang, China; https://ieeexplore.ieee.org/document/9756537/; Cancer subtypes analysis; ensemble clustering; double weighted ensemble clustering; similarity matrix; entropy
spellingShingle Xin Zhang
Hua Huo
Double Weighted Ensemble Clustering for Cancer Subtypes Analysis
IEEE Access
Cancer subtypes analysis
ensemble clustering
double weighted ensemble clustering
similarity matrix
entropy
title Double Weighted Ensemble Clustering for Cancer Subtypes Analysis
title_full Double Weighted Ensemble Clustering for Cancer Subtypes Analysis
title_fullStr Double Weighted Ensemble Clustering for Cancer Subtypes Analysis
title_full_unstemmed Double Weighted Ensemble Clustering for Cancer Subtypes Analysis
title_short Double Weighted Ensemble Clustering for Cancer Subtypes Analysis
title_sort double weighted ensemble clustering for cancer subtypes analysis
topic Cancer subtypes analysis
ensemble clustering
double weighted ensemble clustering
similarity matrix
entropy
url https://ieeexplore.ieee.org/document/9756537/
work_keys_str_mv AT xinzhang doubleweightedensembleclusteringforcancersubtypesanalysis
AT huahuo doubleweightedensembleclusteringforcancersubtypesanalysis
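
Illustrative note on the description field: the abstract summarizes DWEC as building one similarity matrix per base clustering and then combining these matrices with weights. The sketch below shows only that generic combining step, as a weighted co-association matrix. It is not the authors' implementation: the local weighting scheme and the block coordinate descent solver are not specified in this record, so the weights here are supplied by hand, and the names `co_association` and `weighted_consensus` are hypothetical.

```python
from typing import List, Sequence


def co_association(labels: Sequence[int]) -> List[List[float]]:
    """Similarity matrix of one base clustering.

    Entry (i, j) is 1.0 when samples i and j share a cluster, else 0.0.
    """
    n = len(labels)
    return [
        [1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def weighted_consensus(
    base_labelings: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> List[List[float]]:
    """Weighted average of the per-clustering similarity matrices.

    Assumption: weights are given by the caller. In the paper they would
    come from the method's own weighting steps, which are not reproduced here.
    """
    if len(base_labelings) != len(weights):
        raise ValueError("need exactly one weight per base clustering")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n = len(base_labelings[0])
    consensus = [[0.0] * n for _ in range(n)]
    for labels, weight in zip(base_labelings, weights):
        if len(labels) != n:
            raise ValueError("all base clusterings must cover the same samples")
        similarity = co_association(labels)
        share = weight / total  # normalize so the weights sum to 1
        for i in range(n):
            for j in range(n):
                consensus[i][j] += share * similarity[i][j]
    return consensus


# Three toy base clusterings of four samples; the first is trusted twice as much.
toy_labelings = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
toy_weights = [2.0, 1.0, 1.0]
toy_consensus = weighted_consensus(toy_labelings, toy_weights)
```

In this toy input, samples 0 and 1 are grouped together by the first two clusterings, so their consensus similarity is (2 + 1) / 4 = 0.75. A final partition would then be obtained by clustering the consensus matrix, a step left out of this sketch.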