M3C: Monte Carlo reference-based consensus clustering
Genome-wide data is used to stratify patients into classes for precision medicine using clustering algorithms. A common problem in this area is selection of the number of clusters (K). The Monti consensus clustering algorithm is a widely used method which uses stability selection to estimate K. Howe...
Main Authors: | John, CR; Watson, D; Russ, D; Goldmann, K; Ehrenstein, M; Pitzalis, C; Lewis, M; Barnes, M |
---|---|
Format: | Journal article |
Language: | English |
Published: | Springer 2020 |
_version_ | 1826262277160960000 |
---|---|
author | John, CR Watson, D Russ, D Goldmann, K Ehrenstein, M Pitzalis, C Lewis, M Barnes, M |
collection | OXFORD |
description | Genome-wide data is used to stratify patients into classes for precision medicine using clustering algorithms. A common problem in this area is selection of the number of clusters (K). The Monti consensus clustering algorithm is a widely used method which uses stability selection to estimate K. However, the method has bias towards higher values of K and yields high numbers of false positives. As a solution, we developed Monte Carlo reference-based consensus clustering (M3C), which is based on this algorithm. M3C simulates null distributions of stability scores for a range of K values thus enabling a comparison with real data to remove bias and statistically test for the presence of structure. M3C corrects the inherent bias of consensus clustering as demonstrated on simulated and real expression data from The Cancer Genome Atlas (TCGA). For testing M3C, we developed clusterlab, a new method for simulating multivariate Gaussian clusters. |
first_indexed | 2024-03-06T19:33:51Z |
id | oxford-uuid:1e5eebaf-d899-454d-b0f8-1f8a4539a54b |
institution | University of Oxford |
last_indexed | 2024-03-06T19:33:51Z |
record_format | dspace |
title | M3C: Monte Carlo reference-based consensus clustering |
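
The description above outlines the core idea: compute a stability score for the real data at a given K, simulate reference datasets with no cluster structure, and compare the real score with the resulting null distribution. The sketch below illustrates that idea only and is not the published M3C procedure. Several choices are assumptions made for the example: k-means as the inner clustering algorithm, a PAC-style score (the proportion of sample pairs whose consensus index lies strictly between 0.1 and 0.9), a null generated from independent Gaussians matched to each feature's mean and standard deviation, and the hypothetical helper names `pac_score`, `null_dataset`, `monte_carlo_test`, and `two_blobs`. The `two_blobs` helper is a simple stand-in for test data and is unrelated to clusterlab.

```python
import random
from itertools import combinations


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, rng, iters=20):
    """Lloyd's k-means with farthest-first initialisation; one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def pac_score(points, k, rng, n_resamples=20, frac=0.8, lo=0.1, hi=0.9):
    """Proportion of pairs whose consensus index is strictly between lo and hi."""
    n = len(points)
    together = {}  # pair -> times both members landed in the same cluster
    sampled = {}   # pair -> times both members were drawn in the same resample
    for _ in range(n_resamples):
        idx = sorted(rng.sample(range(n), int(frac * n)))
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for (a, la), (b, lb) in combinations(zip(idx, labels), 2):
            sampled[(a, b)] = sampled.get((a, b), 0) + 1
            together[(a, b)] = together.get((a, b), 0) + (la == lb)
    consensus = [together[pair] / sampled[pair] for pair in sampled]
    return sum(lo < c < hi for c in consensus) / len(consensus)


def null_dataset(points, rng):
    """Assumed null: independent Gaussians matched to each feature's mean and sd."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 for c, m in zip(cols, means)]
    return [tuple(rng.gauss(m, s) for m, s in zip(means, sds)) for _ in points]


def monte_carlo_test(points, k, n_null=20, seed=0):
    """Compare the real PAC score at K=k with scores from simulated null datasets."""
    rng = random.Random(seed)
    real = pac_score(points, k, rng)
    null = [pac_score(null_dataset(points, rng), k, rng) for _ in range(n_null)]
    # Lower PAC means more stable, so count nulls at least as stable as the real data.
    p = (1 + sum(s <= real for s in null)) / (1 + n_null)
    return {"real_pac": real, "null_pacs": null, "p_value": p}


def two_blobs(n_per=20, gap=20.0, seed=1):
    """Toy data: two well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    return [(rng.gauss(c, 1.0), rng.gauss(c, 1.0)) for c in (0.0, gap) for _ in range(n_per)]


result = monte_carlo_test(two_blobs(), k=2)
print(result["real_pac"], result["p_value"])
```

Repeating `monte_carlo_test` over a range of K values and comparing each real score with its own null distribution is one way to avoid favouring larger K simply because raw stability scores drift with K. A faithful reference distribution would also need to preserve the correlation structure between features, which this independent-feature null does not attempt.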